International Business and Economics Review | nº5| 2014

ISSN:1647-1989

e-ISSN 2183-3265

NATALIYA GODINHO SOARES VIEIRA, PhD Researcher, Portuguese Centre for Global History/CHC, Faculty of Social Sciences and Humanities, NOVA University of Lisbon, Portugal


The business sphere is a multilingual world where foreign language communication skills are crucial in international relations. This leads employers to look for business professionals who have a high level of linguistic competence. Language proficiency increases the chances of negotiation among partners. Two main obstacles create barriers in formal communication in a foreign language: lack of knowledge of specific linguistic structures or terminology, and frequent transitions from one language to another.

This paper contributes to the quest for quick access to a wide range of English, Spanish and Russian online databases that provide authentic language samples. Their application may improve communication skills and facilitate preparation for business discourse.

KEYWORDS: Business communication; Foreign languages; Online corpora; Communication skills; Translation equivalents.


A esfera dos negócios é um mundo multilingue, onde a capacidade de comunicação com recurso a línguas estrangeiras é crucial nas relações internacionais. Por esta razão os empregadores procuram profissionais com um elevado nível de competências linguísticas. A proficiência nas línguas aumenta as possibilidades de negociação entre parceiros. Existem dois obstáculos principais criadores de barreiras numa comunicação formal em língua estrangeira: a falta de conhecimento das estruturas linguísticas específicas ou de terminologia e transições frequentes de uma língua para outra.

Este artigo contribui para a busca de um acesso rápido a uma vasta gama de bases de dados online em inglês, espanhol e russo, que contenham exemplos de língua autêntica. A sua aplicação pode melhorar a proficiência comunicativa e facilitar a preparação do discurso empresarial.

PALAVRAS-CHAVE: Comunicação empresarial; Línguas estrangeiras; Online corpora; Proficiências comunicativas; Equivalentes de tradução.

–  –  –

International relationships lead business professionals to confront situations where they need to consult foreign language dictionaries or reference books, or spend some time on the Internet searching for the proper key terms, lexical compatibility of word forms, well-formed grammatical structures, etc. In most cases of cooperation with foreign markets, business communication occurs via e-mail. A variety of business correspondence, such as commercial documents and newsletters, requires translation into foreign languages. Some of these problems may be resolved with the help of online dictionaries or specific programs, for instance, Google Translator (http://translate.google.com/), Thesaurus (http://thesaurus.com/), Linguee (www.linguee.com), Collins (http://www.collinsdictionary.com/), SDL Trados Studio (video tutorial: http://www.youtube.com/watch?v=w0rAA8baU_Y), etc.

In the last decade a large number of studies in the field of corpus linguistics have confirmed the advantages of applying online corpora in a variety of tasks related to foreign language perception and production (Bowker, 2000; Aston, 2001; Bowker & Pearson, 2002; Laviosa, 2002; Anderson & Corbett, 2009).

A corpus (plural: corpora) is “a large collection of authentic texts that have been gathered in electronic form according to a specific set of criteria” (Bowker & Pearson, 2002: 9). There are four main types of online corpora: monolingual, parallel, multilingual and multimedia.

Monolingual corpora provide users with access to a single-language electronic database. These corpora may be called national corpora (for example, British National Corpus, National Corpus of Polish, Academia Sinica Balanced Corpus of Modern Chinese, Russian National Corpus, Corpus of Modern and Diachronic Spanish of the Royal Academy, Corpus of Contemporary American English, Corpus of Spanish).

Bidirectional parallel corpora contain aligned concordances in two languages (for instance, COMPARA – bidirectional parallel corpus of English and Portuguese). Multidirectional parallel corpora provide aligned concordances in more than two languages (for example, CLUVI – Linguistic Corpus of the University of Vigo, OPUS – collection of translated texts from the web).
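To illustrate what "aligned concordances" means in practice, the following minimal Python sketch stores a bidirectional parallel corpus as a list of aligned sentence pairs and retrieves every pair whose English side contains a query word. The sentence pairs, the `AlignedPair` type and the `find_equivalents` function are invented for illustration; they are not taken from COMPARA or from any other real corpus interface.

```python
import re
from typing import NamedTuple


class AlignedPair(NamedTuple):
    """One aligned unit of a hypothetical English-Portuguese parallel corpus."""
    english: str
    portuguese: str


# Invented sample data, for illustration only.
SAMPLE_CORPUS = [
    AlignedPair("We confirm receipt of your order.",
                "Confirmamos a receção da sua encomenda."),
    AlignedPair("Please find the invoice attached.",
                "Segue em anexo a fatura."),
    AlignedPair("Your order will be shipped on Monday.",
                "A sua encomenda será enviada na segunda-feira."),
]


def find_equivalents(corpus, query):
    """Return the aligned pairs whose English side contains `query` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(query)}\b", re.IGNORECASE)
    return [pair for pair in corpus if pattern.search(pair.english)]


if __name__ == "__main__":
    for pair in find_equivalents(SAMPLE_CORPUS, "order"):
        print(pair.english, "|", pair.portuguese)
```

Reading the two sides of each hit next to each other is what allows a user to check how a term such as order is rendered in the other language across several contexts.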

Multilingual corpora are collections of individual monolingual corpora in several languages (for example, The PolyU Language Bank).

Multimedia corpora may consist of audio materials as well as video-recorded files with their transcriptions (for example, ELISA – English Language Interview Corpus as a Second-Language Application, SCOTS – Scottish Corpus of Texts & Speech, BACKBONE – pedagogic corpora of video-recorded interviews).

Every corpus has its own size, structure and design. For instance, the Corpus of Contemporary American English (COCA) consists of more than 425 million words; the Business Letter Corpus (BLC) provides 1 million specified words or phrases. Some large corpora may consist of various sub-corpora of the above-mentioned types. For example, the Russian National Corpus includes a historical corpus, dialectal corpus, multimedia corpus, spoken corpus, poetic corpus, as well as multidirectional parallel corpora, etc.

Corpora are annotated with morphological, syntactic and semantic information. The specific tagging systems help to describe lexical items (word boundary tagging, part of speech tagging, sense tagging, syntactic relation tagging and semantic relation tagging). As corpora are deliberately selected collections of texts, they may provide multifunctional assistance related to foreign language production, specifically, to identify correct grammatical forms or usage of articles, to select a set of descriptive adjectives, to check translation equivalents, and so on.
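As a rough illustration of how part-of-speech annotation supports such queries, the Python sketch below represents tagged sentences as lists of tokens and counts the adjectives that immediately precede a given noun. The `Token` structure, the tag labels (`ADJ`, `NOUN` and so on) and the sample sentences are assumptions made for this example; real corpora use their own tagsets and query interfaces.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    lemma: str
    pos: str  # simplified, hypothetical tagset: DET, ADJ, NOUN, VERB, PRON


# Hand-annotated toy sentences (invented data).
TAGGED_SENTENCES = [
    [Token("We", "we", "PRON"), Token("received", "receive", "VERB"),
     Token("a", "a", "DET"), Token("competitive", "competitive", "ADJ"),
     Token("offer", "offer", "NOUN")],
    [Token("They", "they", "PRON"), Token("rejected", "reject", "VERB"),
     Token("our", "our", "DET"), Token("final", "final", "ADJ"),
     Token("offers", "offer", "NOUN")],
    [Token("A", "a", "DET"), Token("competitive", "competitive", "ADJ"),
     Token("offer", "offer", "NOUN"), Token("arrived", "arrive", "VERB")],
]


def adjectives_before(sentences, noun_lemma):
    """Count adjectives directly preceding a noun with the given lemma."""
    counts = Counter()
    for sentence in sentences:
        # Walk over adjacent token pairs: (previous, current).
        for previous, current in zip(sentence, sentence[1:]):
            if (previous.pos == "ADJ" and current.pos == "NOUN"
                    and current.lemma == noun_lemma):
                counts[previous.lemma] += 1
    return counts


if __name__ == "__main__":
    print(adjectives_before(TAGGED_SENTENCES, "offer").most_common())
```

Because the query matches on the lemma rather than the word form, the plural offers is counted together with the singular offer, which is one practical benefit of morphological annotation.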

Corpora may serve as personal foreign language advisors or tutors. Corpora help to identify a set of concordances retrieved from different kinds of authentic discourses. This information may be applied in error corrections or in preparations of some papers.

Taking into account the fact that electronic corpora are multifunctional tools (they can include statistical, bibliographic, sociolinguistic data and other extra materials), users have to adapt to the ways they work at the initial stages. In most cases, the process of adaptation does not take much time, because the front page of electronic corpora provides visual and practical instructions.


No two electronic corpora are alike, because each corpus serves its own purposes. However, electronic corpora are applied in many practical ways: to search for lexical items, word forms and other language structures, to listen to audio or video recordings and to access their transcriptions, etc.

Technically, corpora provide a quick search of information that is systematized, classified and visible. Most electronic corpora have free access; some of them require a simple registration. There is no definite methodology for working with corpora. It is essential to be familiar with their functions and possibilities as well as the terminology frequently used in their databases. There are some important definitions:

Concordancer “is able to recover from text all the contexts for a particular item (morpheme, word or phrase) and to print them out in a way which facilitates rapid scanning and comparison.

The most usual format is the keyword-in-context (KWIC) concordance in which the keywords

–  –  –

Table 1. Some selected concordances that contain request from the Business Letter Corpus (http://www.



–  –  –
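The keyword-in-context format can be sketched in a few lines of Python. The example below is a simplified illustration, not the concordancer used by the Business Letter Corpus: the `kwic` function, its `width` parameter and the sample sentences are assumptions introduced here.

```python
import re


def kwic(texts, keyword, width=30):
    """Return keyword-in-context lines: left context, keyword, right context.

    The left context is right-aligned to `width` characters so that the
    keyword starts in the same column on every line.
    """
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Invented sample sentences, for illustration only.
SAMPLE_LETTERS = [
    "We would like to request a copy of your latest catalogue.",
    "In response to your request, we enclose our price list.",
    "Thank you for your prompt reply.",
]

if __name__ == "__main__":
    for line in kwic(SAMPLE_LETTERS, "request"):
        print(line)
```

Aligning the keyword in one column is the design choice that makes rapid scanning possible: the eye can compare what stands immediately to the left and right of the item across all hits.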

In modern political, economic and commercial discourse in different languages we can find a lot of 'anglicisms' (“words or phrases borrowed from English into a foreign language”, Oxford Dictionaries: http://www.oxforddictionaries.com). The use of anglicisms may raise questions about how they function in texts that are not written in English. If we need statistical data and text samples on the use of the term leasing in the Spanish-speaking world of policy, economics, commerce and finance, the information can be obtained with the help of the Corpus of Modern Spanish of the Royal Academy (Real Academia Española. Corpus de Referencia del Español Actual). This corpus is well suited to verifying lexical items in texts from a large number of fields, selected from different sources in Spanish-speaking countries.

Table 2- A front page of the Corpus of Modern Spanish of the Royal Academy (http://corpus.rae.es/creanet.html): typing leasing in a search line.

To find concordances that contain the specific term leasing, it is necessary to type this item in a search line (Table 2), select Política, economía, comercio y finanzas in the option Tema and click on Buscar.

–  –  –

As a result we have obtained 81 cases of the use of the term leasing in 41 documents of the specific field (Table 3). After that we can click on Ver estadística and find out more detailed information about the functioning of this term in Spanish-speaking countries (Table 4).

Table 4- Statistical data: the functioning of a term leasing in the spheres of policy, economics, commerce and finance in Spanish-speaking countries.

–  –  –
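The query steps described above (enter a term, restrict the search to one thematic area, then view the statistics) amount to filtering and counting. The Python sketch below reproduces that logic on a handful of invented records; the `Document` fields, the sample texts and the `term_statistics` function are hypothetical and do not reflect the actual data or interface of the Corpus de Referencia del Español Actual.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Document:
    country: str
    theme: str
    text: str


# Invented records, for illustration only.
DOCUMENTS = [
    Document("España", "Política, economía, comercio y finanzas",
             "El leasing financiero crece. Las empresas prefieren el leasing."),
    Document("Argentina", "Política, economía, comercio y finanzas",
             "Los contratos de leasing se regulan por ley."),
    Document("México", "Ciencias y tecnología",
             "El leasing de equipos informáticos es habitual."),
    Document("España", "Política, economía, comercio y finanzas",
             "El banco anunció nuevas tasas de interés."),
]


def term_statistics(documents, term, theme):
    """Count occurrences of `term` per country within one thematic area.

    Returns (total occurrences, number of documents with a hit,
    per-country counts).
    """
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    per_country = Counter()
    documents_with_hits = 0
    for doc in documents:
        if doc.theme != theme:
            continue  # skip documents outside the selected thematic area
        hits = len(pattern.findall(doc.text))
        if hits:
            documents_with_hits += 1
            per_country[doc.country] += hits
    return sum(per_country.values()), documents_with_hits, per_country


if __name__ == "__main__":
    total, n_docs, by_country = term_statistics(
        DOCUMENTS, "leasing", "Política, economía, comercio y finanzas")
    print(f"{total} cases in {n_docs} documents")
    for country, count in by_country.most_common():
        print(country, count)
```

The distinction between the number of cases and the number of documents matters when interpreting results: a term that appears many times in very few documents is less widely established than one spread evenly across many sources.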

As we see, electronic corpora may be successfully integrated into the work required for continuous improvement of communication skills, helping to update and consolidate language knowledge.

There are many ways to apply electronic corpora for different purposes. We will present some of them, paying more attention to the implementation of corpora as effective foreign language informational advisors and personal language trainers for business professionals.

–  –  –

Participating in negotiations and keeping in touch with foreign colleagues create situations in which business professionals need to adapt to non-native accents or different ways of speaking the same language. Some types of audiovisual electronic corpora may help in speech-perception training.

Audiovisual corpora are original software tools that allow exploration of authentic oral discourse in visual or verbal forms. Audiovisual corpora contain both audio and video files. Audio corpora consist of audio files and text transcriptions without their demonstration in visual form.

In preparation for contacts with English-speaking partners, customers, buyers, etc., we can turn to the English Language Interview Corpus as a Second-Language Application – ELISA (http://www.uni-tuebingen.de/elisa/html/elisa_index.html#topic_keys). This corpus includes video interviews with native English speakers from various regions of the English-speaking world. There are interviews with a city councillor of the community of New Mexico/US, a travelling businesswoman who works in publishing and advertising in Bermuda, the Caribbean, South and Central America and the Pacific, a tour guide from Australia, a language teacher from Oxford University, a photographer from Birmingham, a project manager from the West Midlands/UK, etc. (Table 6). The rich database of ELISA helps users to gain familiarity with varieties of English and to develop language awareness. This corpus lets the user observe differences in the articulation of the same sounds or in the use of weak forms. It also allows us to analyse how speakers divide their utterances into units characterised by falling, rising and level tones, and to listen to a variety of intonation types in which speakers make statements, exclamations, etc.

ELISA provides transcribed speech of all verbal interviews, which is an important aspect for self-testing listening comprehension.

Table 6- ELISA – Browse interviews: a video interview with the owner of the horse-back riding company.

–  –  –

For instance, MURCO may retrieve relevant video files with the necessary lexical items or word forms in the context of full dialogs, which are useful for listening to the pronunciation of Russian initial, middle and final clusters consisting of more than three consonants (агентство (agency), производство (manufacture), партнерство (partnership), философствовать (philosophize), приветствовать (greet), etc.).

Table 7- MURCO – Browse concordance of приветствовать (greet).
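Words containing such clusters can also be pre-selected automatically before they are looked up in a corpus. The Python sketch below flags runs of four or more consecutive consonant letters (that is, more than three) using a regular expression. It works on spelling rather than on actual pronunciation, and the `long_clusters` function and the word list are illustrative assumptions, not part of MURCO.

```python
import re

# Russian consonant letters. The soft sign (ь) and hard sign (ъ) are not
# consonants, so they break a cluster in this simplified, spelling-based view.
CONSONANTS = "бвгджзйклмнпрстфхцчшщ"
CLUSTER = re.compile(rf"[{CONSONANTS}]{{4,}}", re.IGNORECASE)


def long_clusters(word):
    """Return all runs of four or more consecutive consonant letters in `word`."""
    return CLUSTER.findall(word)


WORDS = ["агентство", "производство", "партнерство",
         "философствовать", "приветствовать", "молоко"]

if __name__ == "__main__":
    for word in WORDS:
        print(word, long_clusters(word))
```

A list filtered in this way could then be used as a set of search terms, so that listening practice concentrates on the word forms that are hardest to articulate.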

The above-mentioned examples confirm that audiovisual corpora are very practical in their application. They respond to the problems that individuals face when trying to speak foreign languages.

