***** Making a simple file for Korea1999
// program: 1999Korea_simple
// task: create reduced HAF version
// data: kor1999simple.dta

clear all
set memory 300m
set more off
set linesize 80

**** Original data:
cd "C:\Users\Jiweon Jun\Box Sync\MTUS\Korea1999"
use 1999aggregate.dta, clear
label data "initial Korean1999 aggregate data"
** note: tolower and seq (used below) are user-written commands and must be installed
tolower

** #1
sort c2 c3 c4

*** #2 survey variables
***country: country or region of study ***
gen country="KR"
***survey: year the survey began ***
gen survey=1999
***swave: longitudinal study wave marker ***
gen swave=0
***msamp: multiple samples using the same diary instrument ***
gen msamp=0

***#3. id variables
***hldid: household identifier***
gen hldid=c2
***persid: person/diarist identifier ***
gen persid=c3
** c4 (diary date) is included so the order of the two diaries is deterministic
sort hldid persid c4
***id: diary identifier ***
/* There is no diary marker, but the first line is diary 1 and the second is diary 2,
   so first create a variable with markers. f(1) is the beginning number. */
seq a, f(1) t(2) by(hldid persid)
** then rename it to id
order hldid persid a
rename a id

***#4 case information (day month year diary badcase)
***day: day of week diary kept ***
gen day=-5
*** need to make from the calendar.
/*
 3 Friday
 4 Saturday
 5 Sunday
 6 Monday
 7 Tuesday
 8 Wednesday
 9 Thursday
 10 Friday
 11 Saturday
 12 Sunday
*/
replace day=1 if c4==5|c4==12
replace day=2 if c4==6
replace day=3 if c4==7
replace day=4 if c4==8
replace day=5 if c4==9
replace day=6 if (c4==3|c4==10)
replace day=7 if (c4==4|c4==11)
***month: Month diary kept ***
gen month=9
***year: year diary kept***
gen year=1999

**hhldsize: number of people in household
egen hld_tag=tag(hldid persid)
egen hhldsize = sum(hld_tag), by(hldid)
**nchild: number of children under 18 in household
gen nchild=-9
** At the moment, only people aged 10+ are identified in the released data.
** Hopefully we'll be able to have full information soon.
**agekidx: age of youngest child in household (categories including adults)
gen agekidx=-9
*** empinclm: Original monthly employment income ***
gen empinclm=-9
** At the moment, this information is not available,
** but hopefully it will be released soon.
*** income: total household income
gen income=-9
** At the moment, this information is not available,
** but hopefully it will be released soon.
**urban: urban or rural household
gen urban=-5
replace urban=1 if c152==2
replace urban=2 if c152==1
***** In the Korean data this is 'farm/non-farm', so it is not an exact match.
***** This should be noted in the readme file.
**sex: sex
gen sex=c6
**age: age
gen age=c7
**** Need to set age cap for people over 90
recode age (90/max=90)
*** now age 90 represents aged 90 and over
**civstat: Is diarist in a couple?
gen civstat=2
replace civstat=1 if c10==2
gen paidwork=c15
tab c15, missing
gen workhrs=-5
replace workhrs=c16
replace workhrs=-7 if paidwork==2

**empstat: employment status
**c17 position
**1 employed work
**2 manager CEO
**3 self-employed
**family non-paid

**1 employed full time
**2 employed part-time
**3 employed, unknown status
**4 not in paid work
*** The definition of full-time work:
*** we use 30 hours as threshold.
gen empstat=-5
** missing values compare as larger than any number, so exclude them explicitly
replace empstat=1 if workhrs>=30 & !missing(workhrs)
replace empstat=2 if workhrs>0 & workhrs<30
replace empstat=3 if (empstat==-5 & workhrs==0)
*** These are people who reported zero hours of work but said they did paid work.
replace empstat=4 if paidwork==2
** We cannot distinguish whether these people did not work because they are not in paid work at all,
** or they just didn't work in this week, as no info (e.g., retired, student) is available.
tab empstat, missing

*** edcat: harmonised highest level of education
gen edcat=-5
** c8, c9
replace edcat=1 if (c8>=0 & c8<=2) | (c8==3 & (c9==2|c9==3|c9==4))
replace edcat=2 if (c8==3 & c9==1)
** to check: the upper bound below is on c9; it may have been intended as c8<=6
replace edcat=3 if (c8>=4 & c9<=6)

**** Made most variables.
******* Afterwards, create day & main variables.
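
* A minimal sketch, not part of the original program: the variables above that are
* initialised to -5 should have no -5 left once every case has been classified.
* The loop only counts and reports; it does not change the data.
foreach v of varlist day urban empstat edcat {
    quietly count if `v' == -5
    display "`v': " r(N) " case(s) still coded -5"
}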
**************************************************************
*******Main Activity Code Variables **************************
** names were already lowercased above, so C* may match nothing; capture keeps the run going
capture tolower C*
** note that this is converted from aggregate data, not episode level data, and also
** there is no information on with whom & where.

*** First create the main variables
foreach x of numlist 1/69 {
    gen main`x'=0
}

***Main1: Imputed personal or household care
** not possible to create
replace main1=-9
***Main2: Sleep and naps
replace main2=c26+c27
***Main3: Imputed sleep -- code later
replace main3=-9
***Main4: Wash, dress, care for self
replace main4=c31+c32+c34+c37
***Main5: Meals at work or school
*** Not possible to distinguish
replace main5=-9
**************************************************
***Main6: Other meals or snacks
replace main6=c28+c29+c30
***Main7: Paid work - main job (Not at home) (loc!=1 bus driver etc.)
replace main7=c38+c41+c48
***Main8: Paid work at Home
replace main8=c42
***Main9: Second or other Job (not at home)
replace main9=c39+c44+c45
***Main10: Unpaid work to generate household income
**** Not possible to distinguish
replace main10=-9
***Main11: Travel as a part of paid work
replace main11=c141
***Main12: Work breaks
replace main12=c40
***Main13: Other time at workplace
replace main13=c43+c49
***Main14: Look for work
replace main14=c47
***Main15: Regular schooling or education
replace main15=c50+c51+c53+c54+c55
***Main16: Homework
replace main16=c56+c52
***Main17: Leisure/other education or training
replace main17=c57+c110+c111+c112+c113+c114+c58
***Main18: Food preparation, cooking
replace main18=c59+c61
***Main19: set table/wash or put away dishes
replace main19=c60
***Main20: Cleaning
replace main20=c67+c68+c69
***Main21: Laundry ironing Clothing Repair
replace main21=c62+c63+c64
***Main22: Home/vehicle maintenance/improvement
replace main22=c70+c71
***Main23: Other domestic work
replace main23=c76+c77+c79+c72
***Main24: Purchase goods
replace main24=c73+c74+c75+c136
***Main25: Consume personal care services
replace main25=c33+c35
***Main26: Consume other services
replace main26=c65+c78
***Main27: Pet care (other than walk dog)
*** Not possible to distinguish
replace main27=-9
***Main28: Physical or medical care of child
replace main28=c80+c83
***Main29: Teach child, help with homework
replace main29=c84
***Main30: Read to, talk to, play with child
replace main30=c81
***Main31: Supervise, accompany, other child care
replace main31=c85+c82+c86
***Main32: Adult care
replace main32=c87+c88+c89
***Main33: Voluntary work, civic organisation activity
replace main33=c90+c91+c92+c93+c94+c95+c96+c97+c149
***Main34: Worship and religious activity
replace main34=c115+c116+c117
***Main35: General out-of home leisure
** Not possible to create (not possible to distinguish in/out)
replace main35=-9
***Main36: Attend sporting event
replace main36=c121
***Main37: Cinema, theatre, opera, concert
replace main37=c118+c119
***Main38: Other public event
replace main38=c120+c122
***Main39: Restaurant, cafe, bar, pub
replace main39=c133
***Main40: Party, reception, social event, gambling
**** In Korean data's case, Social occasions not happening at home.
replace main40=c101+c102
***Main41: Imputed time away from home
replace main41=-9
***Main42: General sport or exercise
replace main42=c125+c126+c128
***Main43: Walking (not walk dogs)
replace main43=c123+c124
***Main44: Cycling
*** not possible to distinguish
replace main44=-9
***Main45: Other out-of doors recreation
replace main45=c127
***Main46: Gardening/forage, hunt/fish
replace main46=c46
***Main47: Walk dogs (or other animals)
**** Not possible to distinguish
replace main47=-9
***Main48: Receive or visit friends
*** note that location is not distinguished.
replace main48=c99+c100
***Main49: Conversation
*** telephoning including text messaging
replace main49=c98
***Main50: Other in-home social, games.
replace main50=c131
***Main51: General indoor leisure
replace main51=c137
***Main52: Artistic or musical activity
*** Not possible to distinguish
replace main52=-9
***Main53: Written correspondence
*** not possible to distinguish
replace main53=-9
***Main54: Knit, crafts, hobbies
replace main54=c132+c66
***Main55: Relax, think, do nothing
replace main55=c134+c135
***Main56: Read
replace main56=c129+c103+c104
***Main57: Listen to Music, Audio book
replace main57=c108
***Main58: Listen radio
replace main58=c107
***Main59: Watch TV
replace main59=c105+c106
***Main60: Play computer games
replace main60=c130
***Main61: Send email, surf internet, computing
replace main61=c109
***Main62: No activity but recorded mode of travel
replace main62=-9
***Main63: Travel to or from work
replace main63=c140
***Main64: Education-related travel
replace main64=c142
***Main65: travel for voluntary, civic or religious activity
replace main65=c145
***Main66: Child or adult care travel
replace main66=c144
***Main67: Travel for shopping, personal or household care
replace main67=c139+c143
***Main68: Travel for other purposes
replace main68=c138+c146+c147+c148
***Main69: No recorded activity
replace main69=c150

**** Check whether the total is 1440 minutes.
recode main1 main3 main5 main10 main27 main35 main44 main47 main52 main53 main41 main62 (-9=0)
*** Convert to minutes
foreach var of varlist main* {
    replace `var'=`var'*10
}
egen tot_main=rsum(main1-main69)
tab tot_main
save 1999korsimple1.dta, replace

use 1999korsimple1.dta, clear
**** this is the data with both survey & main variables.
*** 1414 have no record - so recode it into main69
gen main70=0
replace main70=1440-tot_main
tab main70 main69 if main70>0
gen k=main69
replace k=main70 if main70>0
replace main69=k
tab main69
egen tot_main2=rsum(main1-main69)
tab tot_main2
*** Main total is 1440, which is correct.
*** recode them back to -9
recode main1 main3 main5 main10 main27 main35 main44 main47 main52 main53 main41 main62 (0=-9)
*** main69 done.
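
* A minimal sketch, not part of the original program: instead of reading the
* tabulation of tot_main2 by eye, count the diaries whose main activities do not
* add up to 1440 minutes. It assumes tot_main2 as created above and only reports.
quietly count if tot_main2 != 1440
display r(N) " diary(ies) do not sum to 1440 minutes"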
recode main1 main3 main5 main10 main27 main35 main44 main47 main52 main53 main41 main62 (-9=0)

*** simple
gen sleep=main2+main3
gen eatdrink=main5+main6
gen selfcare=main1+main4
** paidwork still holds the c15 paid-work indicator created earlier;
** drop it so the name can be reused for the time-use total
capture drop paidwork
gen paidwork=main7+main8+main9+main10+main11+main12+main13+main14
gen educatn=main15+main16+main17
gen foodprep=main18+main19
gen cleanetc=main20+main21+main23
gen maintain=main22
gen shopserv=main24+main25+main26
gen garden=main46
gen petcare=main27+main47
gen eldcare=main32
gen pkidcare=main28+main31
gen ikidcare=main29+main30
gen religion=main34
gen volorgwk=main33
gen commute=main63+main64
gen travel=main62+main65+main66+main67+main68
gen sportex=main42+main43+main44
gen tvradio=main57+main58+main59
gen read=main56
gen compint=main60+main61
gen goout=main35+main36+main37+main38+main39+main40+main41+main45
gen leisure=main48+main49+main50+main51+main52+main53+main54+main55
gen missing=main69
gen restrnt=main39
gen eatatwrk=main5
gen compgame=main60
gen caretrav=main66
gen sppart=-9
*** not possible to create at the moment.

*** recode them back to -9
recode main1 main3 main5 main10 main27 main35 main44 main47 main52 main53 main41 main62 (0=-9)
** the file already exists from the earlier save, so replace is needed
save 1999korsimple1.dta, replace

** make nowght.
* missing 90+ min
* <7 episodes
* missing eating or drinking
* missing sleep or rest
* missing personal care
* missing travel

** missing eating or drinking & travel
list hldid persid id if (eatdrink==0 & restrnt==0 & commute==0 & travel==0 & sportex==0 & caretrav==0) & main69<90
***
* hldid persid id
* 10016 1 2 ==> has sleep hour of 1440 minutes ==> nowgt=1
* the other two (14114 2 2, 14339 1 1) ==> visited/received friends (in Korean culture,
* it is likely to have meals with them), and have some food preparation time

** missing eating / self care
list hldid persid id if eatdrink==0 & restrnt==0 & selfcare==0 & main69<90
** same person (10016)

** missing eating and sleep - 0 person
list hldid persid id if eatdrink==0 & restrnt==0 & sleep==0 & main69<90

** missing sleep & travel
list hldid persid id if sleep==0 & commute==0 & travel==0 & sportex==0 & caretrav==0 & main69<90
** 0

** missing personal care and sleep
list hldid persid id if sleep==0 & selfcare==0 & main69<90
list if sleep==0 & selfcare==0 & main69<90
** 6204 2 1 : this person put 1270 minutes of paid work, and job is labour
** maybe nightshift?

** missing personal care and travel
list hldid persid id sex age if selfcare==0 & commute==0 & travel==0 & sportex==0 & caretrav==0 & main69<90
tab age if selfcare==0 & commute==0 & travel==0 & sportex==0 & caretrav==0 & main69<90
*** there are 443 diaries, but as we are using the aggregate file, it is likely that
*** these are good diaries as defined by the coding procedure.

** conclusion.
gen nowght=-5
replace nowght=1 if main69>90
replace nowght=1 if (eatdrink==0 & restrnt==0 & selfcare==0)
tab nowght
replace nowght=0 if nowght==-5
gen propwt=-5
gen weight=c25

*** two or more
* missing sex/age/day of the week
*** no data has missing values on the above.
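
* A minimal sketch, not part of the original program: confirm the statement above
* by counting diaries with a missing or unassigned (negative) sex, age or day.
* It assumes sex, age and day as created earlier and only reports.
quietly count if missing(sex, age, day) | sex < 0 | age < 0 | day < 0
display r(N) " diary(ies) with missing or unassigned sex, age or day"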
keep country survey swave msamp hldid persid id day month year hhldsize nchild ///
    agekidx income urban sex age civstat empstat workhrs empinclm edcat propwt sppart ///
    main* sleep eatdrink selfcare paidwork educatn foodprep cleanetc maintain shopserv ///
    garden petcare eldcare pkidcare ikidcare religion volorgwk commute travel sportex ///
    tvradio read compint goout leisure missing restrnt compgame eatatwrk caretrav nowght weight

order country survey swave msamp hldid persid id day month year hhldsize nchild ///
    agekidx income urban sex age civstat empstat workhrs empinclm edcat propwt sppart ///
    main* sleep eatdrink selfcare paidwork educatn foodprep cleanetc maintain shopserv ///
    garden petcare eldcare pkidcare ikidcare religion volorgwk commute travel sportex ///
    tvradio read compint goout leisure missing restrnt compgame eatatwrk caretrav nowght weight

save kr1999hsf.dta
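
* A minimal sketch of two final checks, not part of the original program.
* Assumption 1: hldid, persid and id together identify each diary uniquely;
*   isid stops with an error if they do not.
* Assumption 2: the 25 simple categories partition main1-main69, so they should
*   also sum to 1440 (restrnt, eatatwrk, compgame and caretrav are left out
*   because they repeat time already counted in goout, eatdrink, compint and travel).
* tot_simple is a hypothetical helper name and is dropped again at the end.
isid hldid persid id
egen tot_simple = rowtotal(sleep eatdrink selfcare paidwork educatn foodprep ///
    cleanetc maintain shopserv garden petcare eldcare pkidcare ikidcare religion ///
    volorgwk commute travel sportex tvradio read compint goout leisure missing)
quietly count if tot_simple != 1440
display r(N) " diary(ies) whose simple categories do not sum to 1440 minutes"
drop tot_simple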