Noob Academic

The role of a corpus in an NLG system according to Dale and Reiter

May 16, 2016 | 1 Minute Read

Today we return to the work of Dale and Reiter. We will clarify some things we have been glossing over. The main thing we want to look at again is how an NLG system works. My first question: what is a corpus for?

It seems we can answer this question by looking at the part of the text that deals with system specification. The authors say, on many occasions, that their text describes a way of building NLG systems in which one works from collections of inputs together with the texts the system should output. It is not easy to specify what an NLG system should do by working through and emphasising the details of NLG itself. That is why it helps to use the inputs and outputs of such systems. Taken together, these inputs and their outputs are what we call a corpus. The plural of this word is corpora.

The authors note that in many fields of study a corpus is a collection of inputs only. In NLG, however, it must be accompanied by the outputs. The authors point to the work of McKeown et al[1], who try to show how a corpus can be used.
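
To make the idea of an NLG corpus as paired inputs and outputs concrete, here is a minimal sketch in Python. The names `CorpusEntry` and `is_nlg_corpus`, and the weather-style records, are my own assumptions for illustration; they do not come from Dale and Reiter or from McKeown et al.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class CorpusEntry:
    """One corpus example: an input and the text expected for it."""
    input_data: Dict[str, Any]  # hypothetical input record
    output_text: str            # human-written text for that input


def is_nlg_corpus(entries: List[CorpusEntry]) -> bool:
    """True only if the corpus is non-empty and every input has an output text."""
    return bool(entries) and all(e.output_text.strip() for e in entries)


# Hypothetical entries, invented purely for illustration.
corpus = [
    CorpusEntry({"city": "Cape Town", "max_temp_c": 24},
                "Cape Town will reach a maximum of 24 degrees."),
    CorpusEntry({"city": "Durban", "max_temp_c": 29},
                "Durban will reach a maximum of 29 degrees."),
]

print(is_nlg_corpus(corpus))
```

A collection that holds inputs alone would fail this check, which is the distinction the authors draw between NLG and other fields.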

Practical Issues in Automatic Documentation Generation

McKeown et al[1] built a system called PLANDoc. Its job is to take a trace of an engineer's work in a networking tool and produce a summary of one or two pages. The corpus is used to see what such summaries look like. In other words, the corpus is used to find out what must appear in the text the system will output. Another thing that makes the corpus important is that it shows how things are arranged within a summary.

Sentences

The next point, following on from what we have said about the corpus, is whether examining the words used in the example texts the system is supposed to produce is something done by hand. When we analyse the corpus, we need to be able to say which category each sentence falls into:

  1. Unchanging text - this text does not come from the input.
  2. Information taken as-is from the input.
  3. Computed information - it does not appear directly in the input. This information is produced by using tools such as a reasoner.
  4. Information that is not in the input. This does not come from a reasoner either.
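
The four categories above can be sketched as labels attached to each sentence of a corpus text. This is only an illustration under my own assumptions: the names `Source`, `tally`, and `not_derivable`, and the annotated sentences, are hypothetical and are not taken from the authors.

```python
from collections import Counter
from enum import Enum


class Source(Enum):
    UNCHANGING_TEXT = 1     # fixed text, does not come from the input
    DIRECTLY_AVAILABLE = 2  # taken as-is from the input
    COMPUTED = 3            # derived from the input, e.g. by a reasoner
    UNAVAILABLE = 4         # not in the input and not from a reasoner


# Hypothetical hand annotation of one corpus text, invented for illustration.
annotated = [
    ("Weather summary for the week.", Source.UNCHANGING_TEXT),
    ("The maximum temperature was 24 degrees.", Source.DIRECTLY_AVAILABLE),
    ("It was warmer than the previous week.", Source.COMPUTED),
    ("Residents enjoyed the sunshine.", Source.UNAVAILABLE),
]


def tally(sentences):
    """Count how many sentences fall into each category."""
    return Counter(label for _, label in sentences)


def not_derivable(sentences):
    """Sentences whose content is neither in the input nor computed from it."""
    return [text for text, label in sentences if label is Source.UNAVAILABLE]


for label, count in tally(annotated).items():
    print(label.name, count)
print(not_derivable(annotated))
```

Listing the category 4 sentences separately is a design choice of this sketch: they are the ones whose content cannot be traced back to the input, so they are worth a second look when deciding what the system should output.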

You can use a corpus in many ways to learn about the text that your system will output.

References

  1. McKeown, K., Kukich, K. and Shaw, J., 1994, October. Practical issues in automatic documentation generation. In Proceedings of the fourth conference on Applied natural language processing (pp. 7-14). Association for Computational Linguistics.