I reached 100 million words in my English corpus yesterday. To get there, I’ve been adding 300,000 words a day for the last three months. I have collected and analyzed the contents of 1,344 books and, along the way, have read 265 of them so far. My Spanish corpus has 32.5 million words from 368 books, which I’ve collected from libraries, used book stores, those precious few bookstores that carry children’s and young adult books in Spanish, and Amazon e-books. I started this project three years ago, when I left my very sensible job to pursue it. Thanks to my family and friends, and most of all my husband Mike, for their support, patience and understanding on this journey.
The corpus I based my research on was the British National Corpus, a 100-million-word collection of papers and texts on linguistics, chemistry and biology. Its list of top words included weird entries like ‘membrane’ that made me say ‘nuh-uh!’ More recent collections seemed to suffer from the same issue. Now, drawing on popular and classic books at a grade 5 level, I have answered for myself which words are useful for independent language learning with reading as your springboard.
In the coming week I’ll announce the first product based on the results of my research, which you can back on Patreon. It will contain the collection of words you need to cover 80% of the running words you meet in a grade 5 level book in Spanish. Spoiler alert: it’s going to be fun, and it will fit in a Christmas stocking!
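For the curious, here is a minimal sketch of how an "80% of running words" figure can be computed in general. It is an illustration only, not necessarily the method used for this project: the function name `words_needed_for_coverage`, the simple letters-only tokenizer, and the tiny sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def words_needed_for_coverage(text, target=0.80):
    """Return the most frequent words needed to cover `target` of all
    running words (tokens) in `text`, plus the coverage actually reached.

    Hypothetical helper for illustration; a real corpus pipeline would
    likely also handle lemmas, proper nouns, and so on.
    """
    # Assumption: a "word" is any run of letters (accented letters included).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)

    covered = 0
    chosen = []
    # Walk from the most frequent word down until the target share is met.
    for word, n in counts.most_common():
        if covered >= target * total:
            break
        chosen.append(word)
        covered += n

    coverage = covered / total if total else 0.0
    return chosen, coverage


# Toy sample, invented for this example (not taken from the corpus).
sample = "el gato ve el perro y el perro ve el gato y el pato"
words, coverage = words_needed_for_coverage(sample)
print(len(words), "words cover", round(coverage * 100), "% of the text")
```

On a real corpus the list grows slowly at first and then much faster, because a small number of very common words account for a large share of the running text.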
Please leave a comment if you like what you’re reading, and share my site with someone you think would be interested. I’m not looking for venture capital; this is strictly grass-roots and supported by the community. You, in fact. Thanks for your support.
P.S. Don’t get me wrong: I wouldn’t say no to Disney or Amazon showing up on my doorstep. But in the meantime, your support means the world to me.