Noob Academic

A case study: NLG meeting weather industry demand for quality and quantity of textual weather forecasts by S. Sripada, N. Burnett, R. Turner, J. Mastin & D. Evans

April 01, 2016 | 3 Minute Read

In the weather industry there is a need for ways to produce weather forecasts quickly. The forecasts must be tailored to each person's needs, and they must not be nonsense. This paper sets out why the authors think all of this can be achieved. The work was carried out by the company Arria NLG together with the Met Office, the UK's national weather agency.

Nowadays the weather is communicated through images, maps and other means. One thing that is still produced by hand, even now, is the written forecast. This takes time. Automating it would be a good thing, because forecasters would gain time to deal with other important matters. Weather agencies also need to improve, and one important requirement is a forecast for every location in the country. This increases the amount of text needed and makes it unrealistic to expect people to write all of the required forecasts.

In this paper the authors present the NLG techniques they used to build a system used by the UK's weather office, the Met Office. The Met Office forecasts the weather for about five thousand locations. Its website also has a section where the forecast is given in words. That textual forecast does not cover every location, because writing one for each would take too long. The system built by the authors tries to fix this. It runs on an ordinary computer and produces 15000 texts in under a minute.
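To get a feel for what that last figure implies, here is a quick back-of-the-envelope calculation. It uses only the two numbers quoted above, and because the run time is given as "under a minute", the results are bounds rather than measurements.

```python
# Figures quoted in the post: 15000 texts in under one minute.
texts = 15_000
seconds = 60  # upper bound on the run time ("under a minute")

min_texts_per_second = texts / seconds    # lower bound on throughput
max_ms_per_text = 1000 * seconds / texts  # upper bound on time per text

print(min_texts_per_second)  # 250.0
print(max_ms_per_text)       # 4.0
```

So the system produces at least 250 texts per second, which is at most 4 milliseconds per text.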

We now want to know whether other people have done work related to the authors'. Generating weather forecasts by computer was one of the first things NLG researchers did. Bateman and Zock[1] compiled a list of the names and descriptions of many NLG systems, and it is clear that many of them deal with the weather. The first system of this kind was FOG, which was built to work within the Forecast Production Assistant. Other systems have been built as well. The paper discusses many weather-related systems, but it does not say what those systems lack and what the system the authors are building will add. The authors simply say, at the end, that what those systems mainly lack is the ability to produce a large number of texts in a short time. The question we are left with is: does that matter?

The authors built this system with five considerations in mind. Before we go on, it is important to note that the paper does not fully explain how the system works. The reason is that the authors sell the system and do not want it copied. What they do say is that it uses the Arria NLG Engine. They say the system follows five steps in the data-to-text process. These steps deserve a closer look, because in the paper by Reiter and Dale[2] there are six required steps, not five. Three kinds of weather data are fed into the system:

1. Forecast data for parameters such as temperature, wind speed and direction, visibility and rain, given every three hours.
2. Average daily and nightly values of the parameters above.
3. Seasonal averages for temperature.
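Since the paper does not disclose how the system works, the following is only a toy sketch of how these three kinds of input might fit together in a data-to-text step. It is not the Arria NLG Engine and not the authors' method. All class, field and function names are hypothetical, and the daytime window (06:00 to 18:00), the units and the 2-degree tolerance are my own assumptions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ThreeHourlyReading:
    """One three-hourly forecast step (hypothetical field names and units)."""
    hour: int              # hour of day, 0-23
    temperature_c: float
    wind_speed_mph: float
    wind_direction: str    # e.g. "SW"
    visibility_km: float
    rain_mm: float


def daytime_average(readings, start=6, end=18):
    """Average temperature over readings whose hour falls in [start, end).

    The 06:00-18:00 window is an assumption, not taken from the paper.
    """
    temps = [r.temperature_c for r in readings if start <= r.hour < end]
    return mean(temps)


def describe_temperature(readings, seasonal_average_c, tolerance_c=2.0):
    """Toy template: compare the daytime average with the seasonal average."""
    avg = daytime_average(readings)
    if avg > seasonal_average_c + tolerance_c:
        trend = "warmer than"
    elif avg < seasonal_average_c - tolerance_c:
        trend = "cooler than"
    else:
        trend = "close to"
    return f"Daytime temperatures around {avg:.0f}C, {trend} the seasonal average."


# Made-up example data: (hour, temperature) pairs for one location.
readings = [
    ThreeHourlyReading(h, t, 10.0, "SW", 20.0, 0.0)
    for h, t in [(6, 10.0), (9, 14.0), (12, 18.0), (15, 18.0), (18, 13.0)]
]
sentence = describe_temperature(readings, seasonal_average_c=11.0)
print(sentence)
# Daytime temperatures around 15C, warmer than the seasonal average.
```

A fixed template like this is about the simplest possible approach; a real system would have to do far more to meet the quality requirements discussed in this post.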

We have not forgotten that we said the authors had five considerations in mind when building this system:

1. Conveying the decline in the quality of forecasts for later days.
2. The absence of a corpus.
3. Producing high-quality texts.
4. Producing a large number of texts.
5. Building a system that can be used worldwide.

We should consider the role, or the value, of the box plot. The paper includes a box plot, but it adds no information. In my view, the most important thing in this case study is knowing how the authors maintain the quality of the texts. Quality is assessed by experts and by people who care about the weather. The authors say that experts from the Met Office agree that quality has not dropped. They also ran a survey on the Met Office website to get the opinion of people who use, or care about, weather forecasts. They received responses from 35 people. My problem with this survey is that nothing shows that the people who answered are not experts working at the Met Office. The survey results show that people think the quality of the texts has not dropped and that the length of each text is right. The survey question itself, however, is a problem. The question reads:

Would you recommend this feature?

The problem is that the question is not about the text feature alone. On the Met Office website, the page where this system is shown contains many things. It provides images, tables and weather texts. The question does not make clear that it is asking about the texts alone.

  • [1] Bateman, J., and Zock, M. (2012). Bateman/Zock list of NLG systems. http://www.nlg-wiki.org/systems/
  • [2] Reiter, E., and Dale, R. (1997). Building applied natural language generation systems. Natural Language Engineering 3(1): 57-87.