On the occasion of the new Hebrew year, I thought I’d make this entry a bit lighter than usual and present blessings for the new year in the spirit of NLP, spiced up with some advice, so here goes:
When setting the goals for extraction results, try to find the optimal balance between recall and precision. Remember: Aiming high is good, just make sure there’s an easy way down!
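For anyone who wants to put numbers on that balance, here is a minimal sketch in Python. The function name and the counts are made up for illustration and are not taken from any real rulebook. It assumes you have already counted correct extractions, wrong extractions, and missed ones for a relation.

```python
def precision_recall_f(true_positives, false_positives, false_negatives, beta=1.0):
    """Return (precision, recall, F-beta) for one extraction run.

    beta > 1 weights recall more heavily; beta < 1 favors precision.
    """
    predicted = true_positives + false_positives   # everything we extracted
    actual = true_positives + false_negatives      # everything we should have extracted
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


# Hypothetical counts: 80 correct extractions, 20 wrong ones, 40 missed.
p, r, f1 = precision_recall_f(80, 20, 40)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Tuning `beta` is one way to state up front whether you would rather climb toward recall or toward precision before you start aiming high.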
Do not attempt to debug a rulebook for more than 24 hours straight. We’ve been there. We’ve done that. If you think you’re stuck now, you ain’t seen nada yet!
Remember the chain of language processing from the last entry? First comes the HTML converter, which hands the info down to CARE. CARE does all the hard work and in turn passes the parsed relations to the Post Processor, so that it can rest on its laurels.
Perfect “Anaphora Resolution” is a myth. You can try. For a while, you may even believe you’ve done it. Our prediction: In the end the bubble will burst. Or you’ll go crazy trying to solve all the problems. Or both.
Make sure that all the crucial information that comes in as input goes out as output.
Remember the engineer who didn’t take our advice and corrected extraction rules through the night?
Just like life itself, retrieving the ultimate extractions for a given relation may be an extremely tedious and laborious task. Lighten up, add some spice. Make fun of yourself!
Do everything you do with love and CARE!
But remember that if you’re not enjoying it, you’re probably not doing it right!
Here from Cambridge, MA, wishing you all a wonderful year full of wonderful experiences, CARE-ing, happiness and love!