Even if you are a talented researcher, you have to write about your research well to make a real impact in the ever-evolving world of science. For most people, research and writing are two entirely different skills, and even gifted researchers can struggle to write coherently and succinctly. It is naturally harder still for non-native English speakers. Though English is by and large the global language of science, it is also a difficult language to master. What is the difference between “stand by” and “stand off”? Do you “take” a test or “give” a test? When you need help with the English language, where do you go? A lesser-known yet comprehensive source of help for academic writing is the language corpus. Here, we explain how you can make the most of this powerful resource to improve your writing and build your overall confidence with English.
What is a Language Corpus?
A language corpus is a large collection of electronic text compiled for language research. Popular corpora and corpus tools include the Corpus of Contemporary American English (COCA), the Google Books Ngram Viewer, the Corpus of Historical American English (COHA), the Michigan Corpus of Academic Spoken English, and Hyper Collocation. These corpora provide a searchable collection of English used by native speakers in various contexts. In English language classes, teachers use them to show students how a word is typically used by native speakers.
What is the difference between a dictionary and a corpus? Why should non-native English speakers turn to a corpus rather than a dictionary for answers? First, a dictionary gives you the meaning of a word, but it rarely includes more than a few usage examples. The word “extract” in plain English means “to remove.” But if I have to describe a physical action I took during my research, should I write “extract from” or “extract to”? A dictionary may not answer that question fully, and that’s where a language corpus comes in.
Once you are familiar with the basic corpus search functions, a new range of tools becomes available to you. Corpora often allow searches for synonyms and for the different forms of a word. For instance, you can search for all forms of the verb “extract” in COCA and get “extracts,” “extracted,” “extracting,” and “extract.” You can also choose “collocates” for the search string to get a list of words that regularly appear together with the word “extract.” Selecting the “help” icon will show you the range of search syntax available. For instance, if you type [=extract], you will get a list of synonyms for the word, such as remove, get, fetch, and separate.
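To make the idea of collocates concrete, here is a minimal sketch in Python of what such a search does behind the scenes. It is not how COCA itself works: the function name collocates, the two-word window, and the sample sentences are all assumptions made for illustration, and matching every word that starts with “extract” is only a crude stand-in for a real word-form search.

```python
import re
from collections import Counter

def collocates(text, target, window=2):
    """Count words found within `window` tokens of any word starting with `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        # Crude word-form matching: "extract", "extracts", "extracted", ...
        if tok.startswith(target):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

# Hypothetical sample sentences, not taken from any real corpus
sample = (
    "We extracted DNA from the tissue. "
    "The team extracts oil from seeds. "
    "Extracting data from the records took a week."
)

counts = collocates(sample, "extract")
print(counts.most_common(3))
```

Even in this tiny sample, “from” turns up next to “extract” more often than any other word, which is the kind of pattern a collocate search surfaces at a much larger scale.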
Language corpora are also updated far more frequently than dictionaries. A search in the Oxford dictionary in early 2020 would not have returned a result for “bioabsorbable.” The word came into wider use following technological breakthroughs presented in 2019 and was officially added to Merriam-Webster in mid-2019. If you were looking for examples of how to use this word in your writing, corpora would be able to supply you with instances of its current use.
How Do I Use Language Corpora?
Learning to search with the various corpus tools can be confusing at first. But the great news is that it gets easier quickly. Let’s take a look at how to select a corpus and search for words on these sites to get useful results.
You should select your language corpus depending on what your aim is. If your goal is to know how to use a word that is not specific to your field, then COCA is the best place to begin. Let’s say you wish to find out whether you should say “extract from” or “extract to.” Open COCA and enter “extract to” in the text field. Then click “Find matching strings.” Repeat the search with “extract from.”
When we run both searches, “extract to” returns just 52 uses and “extract from” returns 233.
Selecting “Context” shows how each phrase is used. The results suggest that “extract from” is the phrase to use when the words that follow denote a source or material, whereas “extract to” mostly appears when “extract” is a noun.
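If you want to see what a phrase search like this amounts to, the following Python sketch counts whole-phrase matches in a short piece of text. The function name count_phrase and the sample sentences are assumptions made for illustration; a real corpus search runs over vastly more text and offers far more options.

```python
import re

def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of `phrase` in `text`."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Hypothetical sample text, not taken from any real corpus
sample = (
    "We extract from the sample a small amount of RNA. "
    "The extract to be tested was stored overnight. "
    "Oil was extracted from seeds. "
    "Researchers extract from each batch."
)

from_hits = count_phrase(sample, "extract from")
to_hits = count_phrase(sample, "extract to")
print(from_hits, to_hits)

# Share of "extract from" among the two COCA counts reported above (233 and 52)
share_from = 233 / (233 + 52)
print(round(share_from, 2))
```

Note that this simple search does not match “extracted from,” because it looks for the exact phrase only. That is one reason the word-form searches offered by corpus sites are worth learning.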
If you want to find more discipline-specific words, you can use the Michigan Corpus of Academic Spoken English (MICASE). The benefit of MICASE is that you can search by discipline or type of academic event. If you are writing to prepare for a conference or presentation or are branching out into a new part of your discipline, this tool can be of great help to you.
You may also be wondering about the differences between British and American English. The good news is that there are corpora to assist you with those searches too. The BYU Corpus site links to both American English and British English corpora, so you can search and find out which phrases or terms are preferred in each variety. Should we say “in the hospital” or “in hospital”? A search of the corpora reveals that Americans prefer “in the hospital,” while the British prefer “in hospital.”
A Few Words of Caution
You are surely going to be excited about using this brilliant new tool. And you should be! Language corpora can be extremely helpful in supplying you with real-world examples of language that you would rarely find elsewhere. Dictionaries and Google searches provide much less detail and context than corpora do. But there are certain points of caution to keep in mind when relying on corpora to improve your writing. Remember that corpora don’t tell you what is correct or incorrect; they tell you what usage is common. You can use corpora to support your writing, but you should check your findings against other corpora. For example, search for “extract to” and “extract from” in both COCA and MICASE. This will help confirm your decisions on grammar or vocabulary.
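One practical point when comparing results across corpora is that they differ greatly in size, so raw hit counts are not directly comparable. A common approach is to convert counts to hits per million words. The short Python sketch below illustrates the arithmetic; the function name per_million and all the numbers in it are hypothetical and do not describe COCA, MICASE, or any other real corpus.

```python
def per_million(hits, corpus_size):
    """Return hits per million words so corpora of different sizes can be compared."""
    return hits * 1_000_000 / corpus_size

# Hypothetical hit counts and corpus sizes, for illustration only
large_corpus = per_million(400, 100_000_000)  # 400 hits in 100 million words
small_corpus = per_million(9, 1_500_000)      # 9 hits in 1.5 million words
print(large_corpus, small_corpus)
```

In this made-up example, the smaller corpus has far fewer hits in total, yet the phrase is relatively more frequent there once corpus size is taken into account.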
That said, the important thing to keep in mind is that language is all about communication. When you are trying to find out how to use specific words, real-world examples can give you a deeper understanding of those words. That is why language corpora are an outstanding tool to have, especially when it comes to strengthening your academic writing.
Do you utilize language corpora to assist you in academic writing? Which corpus have you found to be the most helpful? What are other great resources for ESL writers to enhance their academic writing? Do share your views in the comments section below!