Frequently Asked Questions about using the ALSWH data (last updated: 13 April, 2017)

Xenia Dolja-Gore, Peta Forder, David Fitzgerald, Carl Holder, Richard Hockey, Jeeva Kanesarajah, Michael Waller

Data Management

Getting started with the ALSWH data

·         Important information you may need as a data user can be found on our website. On this webpage you will find information about the surveys, getting started, responses for each of the items asked in the surveys (Data Books) and information on how variables have been derived (Data Dictionary and Data Dictionary Supplement); click on the links to learn more. Importantly, please read the following before getting started:

o   Notes for Collaborators has useful information for first-time users of the ALSWH data. This webpage gives an introduction to each cohort, covers study representativeness and attrition, and includes notes on naming conventions for datasets and variables and on missing data.

o   Variable names, labels and formats for each cohort and survey can be accessed by clicking on the ‘Survey Variables’ option or directly by using this web address 

·         You can weight for area of residence at Survey 1 (y1wtarea, m1wtarea, o1wtarea) in all crosstabs, frequencies and analyses to adjust for the initial deliberate oversampling in rural and remote areas. This is not required when running models that include area of residence.

·         Check the data map, the Data Dictionary and the Data Dictionary Supplement for further information about survey items and derived variables. They are available at

·         Data must be downloaded and stored in a secure environment as soon as they are received. Analysis must be undertaken only in accordance with the approved EOI. Changes to the nature of the analysis must be approved by the ALSWH PSA committee.

·         All publications must include the appropriate acknowledgments.
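The Survey 1 area weighting mentioned in the list above can be illustrated with a small sketch. It is not ALSWH code: it assumes each record carries its weight in a field such as y1wtarea, and the item name "smoker" and all values are invented for illustration. It computes weighted proportions for one categorical item, which is what a weighted frequency table reports.

```python
from collections import defaultdict

def weighted_proportions(records, category_key, weight_key):
    """Return the weighted proportion of each category value."""
    totals = defaultdict(float)
    for row in records:
        weight = row[weight_key]
        value = row[category_key]
        if weight is None or value is None:
            continue  # skip rows missing a weight or a response
        totals[value] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {value: total / grand_total for value, total in totals.items()}

# Hypothetical toy records: 'smoker' and the weight values are invented.
sample = [
    {"smoker": "yes", "y1wtarea": 1.2},
    {"smoker": "no", "y1wtarea": 1.2},
    {"smoker": "no", "y1wtarea": 0.4},
    {"smoker": "yes", "y1wtarea": 0.4},
    {"smoker": "no", "y1wtarea": 0.4},
]
proportions = weighted_proportions(sample, "smoker", "y1wtarea")
```

In this toy example the unweighted share of "yes" is 2 in 5, while the weighted share is 1.6 / 3.6, because the records with smaller weights count for less. In practice the equivalent weight option in SAS or Stata would normally be used.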

Linking administrative datasets to ALSWH data

·         If your project includes linkage of datasets, use the ‘IDproj’ key variable for joining the datasets.

·         You should have the list of women who have opted out of the data linkage project(s). These women will not appear in the linked data, and their absence should be taken into account in your analysis.

·         When merging your survey and administrative datasets, ensure that only women who have consented are included in the combined outcome file. Note: in some cases all participants are required for the analysis, but this should be confirmed by your ALSWH liaison.

·         Useful notes for data users linking the PBS and MBS data may be found in Tech Report 38 (December 2015), page 118, and Tech Report 39, page 71.

·         Medicare variable formats may be found on the ALSWH website at

·         Dummy PBS and MBS data are available for testing and development here: . Information regarding these data is available here: .
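The merge and consent checks described above can be sketched as follows. This is an illustration only, not an ALSWH program: IDproj is the key named above, while the function name, the other field names ("age", "item_code") and all values are hypothetical. It joins one survey record per woman to possibly many administrative records, and drops anyone on the opt-out list before joining.

```python
def link_consented(survey_rows, admin_rows, opted_out_ids, key="IDproj"):
    """Inner-join survey and administrative records on `key`, excluding opt-outs."""
    # One survey record per woman, keeping only those not on the opt-out list.
    survey_by_id = {
        row[key]: row for row in survey_rows if row[key] not in opted_out_ids
    }
    linked = []
    for admin in admin_rows:
        survey = survey_by_id.get(admin[key])
        if survey is None:
            continue  # no consented survey record for this administrative row
        linked.append({**survey, **admin})
    return linked

# Hypothetical toy data: every field except IDproj is invented.
survey = [
    {"IDproj": 1, "age": 52},
    {"IDproj": 2, "age": 49},
    {"IDproj": 3, "age": 55},
]
claims = [
    {"IDproj": 1, "item_code": "A"},
    {"IDproj": 1, "item_code": "B"},
    {"IDproj": 2, "item_code": "A"},
    {"IDproj": 3, "item_code": "C"},
]
opted_out = {3}
linked = link_consented(survey, claims, opted_out)
```

Checking that no opted-out IDproj value appears in the result is a cheap safeguard after any merge. If your analysis needs all participants, confirm this with your ALSWH liaison first, as noted above.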

Missing data issues

·         Some participants completed a short survey instead of the full survey, which accounts for some missing data. This occurred in Survey 2 for the three original cohorts and in Survey 3 for the 1921-26 and 1946-51 cohorts. The variable ‘**survey’ has the value 2 for a short survey and 1 otherwise.
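As a rough sketch of how the short-survey flag might be used, the example below separates full-survey from short-survey respondents before computing a missing-data rate. It assumes the ‘**’ prefix stands for the cohort letter and survey number (for example m2survey), by analogy with names such as m1wtarea; that reading, the item name "m2item" and all values are assumptions made for illustration.

```python
SHORT_SURVEY = 2  # per the FAQ text: 2 = short survey, 1 = full survey

def split_by_survey_type(records, flag_key):
    """Separate full-survey respondents from short-survey respondents."""
    full, short = [], []
    for row in records:
        if row[flag_key] == SHORT_SURVEY:
            short.append(row)
        else:
            full.append(row)
    return full, short

def missing_rate(records, item_key):
    """Share of records in which the item is missing (None)."""
    if not records:
        return 0.0
    return sum(row[item_key] is None for row in records) / len(records)

# Hypothetical toy data: 'm2survey' and 'm2item' are assumed names.
respondents = [
    {"IDproj": 1, "m2survey": 1, "m2item": 3},
    {"IDproj": 2, "m2survey": 1, "m2item": None},
    {"IDproj": 3, "m2survey": 2, "m2item": None},
    {"IDproj": 4, "m2survey": 2, "m2item": None},
]
full, short = split_by_survey_type(respondents, "m2survey")
overall_rate = missing_rate(respondents, "m2item")
full_only_rate = missing_rate(full, "m2item")
```

Comparing the two rates shows how much of the apparent missingness comes from respondents who were given the short survey rather than from item non-response.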

Filling in Missing data

·         Handy references at

·         References for representativeness and attrition may be found at

·         Comments on survey missing data:

Statistical Analysis

Useful programming code

·         Reliable programming code to join multiple files may be found for SAS and Stata programmers at the following webpages:

o   Useful SAS code is clearly explained in Wieczkowski, Michael J. Alternatives to Merging SAS Data Sets. But Be Careful. IMS HEALTH, Plymouth Meeting, PA. Also see .

o   Stata code:

·         Our website includes the stripping program, which changes variable names to make wide-to-long transformation easier:

·         Information on enduring conditions is in Tech Report #29 here

(Datasets with these variables may be requested by following the link.)
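To show why stripping variable-name prefixes helps with wide-to-long reshaping, here is a minimal sketch. It is not the ALSWH stripping program. It assumes variable names follow the pattern seen in y1wtarea (cohort letter y, m or o, then the survey number, then the stem); the stem "bmi", the output field "wave" and all values are hypothetical.

```python
import re

# Assumed naming pattern: cohort letter, survey number, then the variable stem.
PREFIX = re.compile(r"^([ymo])(\d+)(.+)$")

def wide_to_long(wide_row, id_key="IDproj"):
    """Turn one wide record into one record per survey wave."""
    by_wave = {}
    for name, value in wide_row.items():
        if name == id_key:
            continue
        match = PREFIX.match(name)
        if match is None:
            continue  # leave out names that do not follow the assumed pattern
        _cohort, wave, stem = match.groups()
        by_wave.setdefault(int(wave), {})[stem] = value
    return [
        {id_key: wide_row[id_key], "wave": wave, **values}
        for wave, values in sorted(by_wave.items())
    ]

# Hypothetical wide record: 'm1bmi' and 'm2bmi' are invented variable names.
wide = {"IDproj": 7, "m1bmi": 22.0, "m2bmi": 23.5, "m1wtarea": 0.8}
long_rows = wide_to_long(wide)
```

Once the prefix is stripped, the same stem ("bmi") lines up across waves, so each wave becomes a row with identical column names. In Stata or SAS the renamed variables would feed the usual reshape or transpose step.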

Derived Variables

·         Questions related to the Food Frequency Questionnaire may be found at

·         Be careful that you do not inappropriately analyse single items from a scale. For example, the 36 items in the SF-36 should not be considered as separate items, other than the first self-rated health item. The Data Dictionary Supplement has details about which scales have been included in the surveys.

·         Commonly used data variable cut-points may be found in the Data Dictionary Supplement or at for the following:

o   Physical activity in Report 21 page 104

o   Mental health cut-points for possible psychosocial distress in Report 16 pages 48 and 66

o   Notes regarding methods of standardising life events may be found in Report 31 page 106

Resources to help you get started

·         This could be a link to the SAS webpage, UCLA or STATA FAQ page





Free SAS tutorials

Useful paper to reference

How to cite this web page:

For example: Frequently Asked Questions.  Australian Longitudinal Study on Women’s Health.  from http:// (accessed November 10, 2016).