Noob Academic

Toward a knowledge-to-text controlled natural language for isiZulu by Maria Keet & Langa Khumalo

February 26, 2016 | 3 Minute Read

Today we look at how the authors built verbalisation patterns for isiZulu. They chose two things to focus on: quantifiers and object properties. There are two reasons for this. First, these two are the most heavily used, which means many applications could come out of this work. Second, there are already several controlled natural languages on the NLG side that use the Description Logics behind OWL. Engelbrecht and colleagues tried to create a scientific isiZulu. They did this to help in nursing and healthcare. Their work managed to translate SNOMED CT so that it can be used by people who speak isiZulu. Anyone who wants more detail on SNOMED CT can read the IHTSDO website[1]. The authors consider both universal and existential quantification. They also consider object properties, whose verbs need conjugation, and conjunction. They say disjunction is simple, so they do not deal with it.

Universal quantifier

We start with the universal quantifier. The authors try to keep things simple by looking only at a universal quantifier at the start of a sentence. English uses “all” and “each”. isiZulu uses many words, but they all share one root, “-onke”. The prefix on this root depends on other words in the sentence. It is important to remember that, because we are only looking at sentences where the universal quantifier comes first, it will be followed by a noun. The prefix for the universal quantifier stem is determined by the second table in Keet and Khumalo's paper[2]. The big problem is that it is not easy to find fuller information about this table. The website it is taken from no longer works. The table shows which prefix to use once you know which noun class the noun belongs to. The authors say that, if you look closely, the prefixes do not change. In other words, the prefix is always the same for a given class. Because this table from Goldsmith and Buthelezi gives us all the prefixes, the authors say you do not need to know which class the noun is in. They say the only thing we really need to know is whether we will use the singular or the plural of the noun. I do not think the first part of that claim is true. Even so, they say the choice is between the following:

N = the noun taken from the name of the OWL class
M = the plural of the noun N
QC = quantitative concord
C = the noun class of N
D = the noun class of M

(a) <QC(all) for C>onke <N>
(b) <QC(all) for D>onke <M>

I have many questions about this. The first: how do we deal with the vowel at the end of the prefix, given that “-onke” already starts with one? We know that Nguni languages do not allow consecutive vowels. The second: why do the authors say we do not need to know the noun class? To me it looks like we have to know it.
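To make patterns (a) and (b) concrete, here is a minimal Python sketch. It is my own illustration, not code from the paper. The function name `verbalise_all` and the lookup table are hypothetical. The entries for classes 9 and 10 match the example sentences later in this post ("yonke indlovu", "zonke izindlovu"); the entries for classes 1 and 2 are filled in from standard isiZulu usage, not copied from the paper's table. The table stores whole quantifier words, which sidesteps the question of where the prefix ends and "-onke" begins.

```python
# Hypothetical lookup: noun class -> full universal quantifier word.
# Only a few classes are listed; this is NOT the paper's full table.
UNIVERSAL_QUANTIFIER = {
    1: "wonke",   # assumption: standard usage, e.g. with "umuntu"
    2: "bonke",   # assumption: standard usage, e.g. with "abantu"
    9: "yonke",   # matches "yonke indlovu" in this post
    10: "zonke",  # matches "zonke izindlovu" in this post
}


def verbalise_all(noun: str, noun_class: int) -> str:
    """Build '<quantifier> <noun>' for a noun whose class is already known."""
    if noun_class not in UNIVERSAL_QUANTIFIER:
        raise ValueError(f"no quantifier recorded for noun class {noun_class}")
    return f"{UNIVERSAL_QUANTIFIER[noun_class]} {noun}"


print(verbalise_all("indlovu", 9))     # yonke indlovu
print(verbalise_all("izindlovu", 10))  # zonke izindlovu
```

Note that this sketch takes the noun class as an argument. Whether a real implementation can avoid that lookup is exactly what my second question is about.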

Existential quantifier

Now we turn to existential quantification. The authors say they will look at OWL EL because it is simple and because it will yield many examples. The examples I am talking about can be found on page twelve[3]. English uses two expressions for existential quantification: “some” and “at least”. isiZulu also uses two. When we deal with “elilodwa”, which does the same job as “at least one” in English, we build it like this:

<RC><QC>dwa

You combine the relative concord and the quantitative concord. Together they form the prefix. From now on we will call the stem plus suffix the "tail". That will make our lives easier. We attach this prefix to our tail, “-dwa”. I am not satisfied with this, though. You get these concords from the noun, and as I understand it you use the object of the sentence. My question is how you know which word is the object in the sentence.
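The `<RC><QC>dwa` pattern can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation. The names `CONCORDS` and `at_least_one_word` are hypothetical. Only the class 5 entry is attested in this post (eli + lo + dwa = "elilodwa"); the class 7 and class 9 entries are my assumption based on standard isiZulu forms.

```python
# Hypothetical table: noun class -> (relative concord, quantitative concord).
CONCORDS = {
    5: ("eli", "lo"),  # attested above: eli + lo + dwa = elilodwa
    7: ("esi", "so"),  # assumption: standard usage
    9: ("e", "yo"),    # assumption: standard usage
}

TAIL = "dwa"  # the stem-plus-suffix "tail" from the pattern <RC><QC>dwa


def at_least_one_word(noun_class: int) -> str:
    """Return the 'at least one' word for the given noun class."""
    if noun_class not in CONCORDS:
        raise ValueError(f"no concords recorded for noun class {noun_class}")
    relative_concord, quantitative_concord = CONCORDS[noun_class]
    return relative_concord + quantitative_concord + TAIL


print(at_least_one_word(5))  # elilodwa
```

The function needs the noun class of the object noun as input, so it does not answer how the object is identified in the first place.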

When we deal with “some”, things are not so simple, because the sentences below all mean the same thing:

1. zonke izindlovu zidla noma yiliphi ihlamvana
2. yonke indlovu idla ihlamvanathize
3. yonke indlovu idla ihlamvana elithize

The first two sentences come from Keet and Khumalo's paper. I think the third sentence means the same thing too. We cannot ignore it, because there the clitic “-thize” carries a prefix. It is important to know how it is built. Today, though, we will only talk about the first two. Sentences that use “noma” are built like this:

copulative + enumerative concord (EC) + enumerative suffix

The word “noma” is used because we are talking about enumeration. Its English equivalent is the phrase “some among many”. It is important to note that you can use either “-thize” or “-thile”. Both of these tails are clitics and are attached to the noun. The reason is that they show the noun is one among others. In other words, the noun describes one thing or several. Universal quantification is obtained by using the templates in Keet and Khumalo's paper[4]. The authors say the first one is the simplest. On top of that, their survey shows that people prefer it to the others. In other words, people think it produces better sentences.

  • [1] The International Health Terminology Standards Development Organisation, http://www.ihtsdo.org/snomed-ct, Retrieved 24 Feb 2016
  • [2] Keet, C.M. and Khumalo, L., Toward a knowledge-to-text controlled natural language of isiZulu. Language Resources and Evaluation, p. 7.
  • [3] Ibid., p. 12
  • [4] Ibid., p. 13