5. Utterance segmentation process


This folder contains the instructions and tools for utterance segmentation and coding. Open the Transcription Protocol folder, then open the C-unit summary PDF, which gives specific instructions for utterance segmentation along with examples.

  1.  

    1. For bilingual speakers of Spanish-English or Castellano-Catalan, use the following utterance segmentation (see item b and the figure below).

    2. See the description of modified C-units below.

 

  • Pro-drop: A linguistic phenomenon in which the subject (noun or pronoun) of a verb is not explicitly stated. Languages that allow this are called pro-drop languages (e.g., Spanish, Catalan, Portuguese, Italian, Hindi, Swahili, Turkish).

 

  • Prepositions: Words that precede a noun or noun phrase and express its relation to another part of the sentence. Some English prepositions are:

    • about, above, across, after, against, among, around, at, before, beside, between, by, down, during, except, for, from, in, into, near, of, off, on, over, through, to, toward, under, up, with  

 

  • Conjunctions: These are words that link together two or more words or phrases. The conjoined units typically are the same part of speech. In English, common conjunctions are:

    • and, but, or, neither... nor..., so

 

  • Subordinate clause: A clause that is dependent (cannot stand alone) and relies on a main (matrix) clause.

 

  • Utterance = Enunciado

 

Utterance Segmentation: Modified Communication Units (MC-units)

ISSUE: The basic unit for segmenting utterances in SALT is the communication unit (C-unit): an independent clause and its modifiers, including subordinate clauses. Thus, a sentence like "the boy went running and grabbed the frog" would be segmented as one utterance. Although the equivalent sentence in Spanish, "el niño estaba corriendo y agarró la rana," could also be segmented as one utterance, doing so could ignore the pro-drop nature of Spanish. Whereas omitting subject nouns or pronouns is typically ungrammatical in English, they can be grammatically dropped in Spanish because the subject information is encoded in the verb (Bedore, 1999). For instance, the English phrase "he jumped" can be grammatically stated in Spanish as (a) "él brincó" ("he jumped"), which includes the pronoun "él" ("he"), or (b) "brincó" ("[he] jumped"), since the referent of the pronoun can be deduced from context and from the morphosyntactic agreement on the verb.

SOLUTION: Modified C-units (MC-units), based on rules originally proposed by Gutiérrez-Clellen and Hofstetter (1994) for Terminable Units in Spanish, are an alternative rule for segmentation that is used for language transcripts contained in the Bilingual Spanish/English Reference databases. MC-units are used because they are better able to (a) account for cross-language differences such as pro-drop in Spanish, and (b) facilitate consistency when transcribing language samples in Spanish and English from the same bilingual speaker. Therefore, segmenting utterances as MC-units is recommended in SALT for bilingual (Spanish-English) samples.

MC-units follow two rules. The first rule, as with standard C-unit segmentation, states that an utterance consists of an independent clause and its modifiers, including subordinate clauses. The second rule states that independent clauses joined by a coordinating conjunction are segmented as two separate utterances. MC-unit segmentation is illustrated in Figure 7-3. The first row illustrates subordinate clauses in Spanish and English, which are not segmented as separate utterances: the subordinating conjunction "cuando" is used in Spanish and "when" in English. The second row illustrates coordinated clauses in Spanish and English, which are segmented into two utterances in each language: the coordinating conjunction "y" is used in Spanish and "and" in English.


Understanding utterance segmentation in Spanish:

What is an Utterance? An utterance is a complete thought expressed by an independent (main) clause and any of its subordinated clauses. For example:

  • English: “The boy was running.”

  • Spanish: “El niño estaba corriendo.”

In this example, each sentence is a single utterance because it contains one independent clause. However, a sentence often contains more than one utterance, as when it includes coordinate structures. For example:

  • English: “The boy was running and grabbed the frog.”

  • Spanish: “El niño estaba corriendo y agarró la rana.”

In this case, two independent clauses are joined by a conjunction ('and' in English, 'y' in Spanish). Each sentence is composed of two utterances because each of the conjoined clauses is a complete statement. In the Spanish example, the two utterances are "El niño estaba corriendo" and "(El niño) agarró la rana". In English, they are "The boy was running" and "(The boy) grabbed the frog". Note that when clauses are conjoined in either language and share the same subject, the second subject may be omitted. This is why neither example has an explicit subject ('he'/'él') before 'grabbed'/'agarró'.

Understanding MC-units: When working with bilingual (Spanish-English) samples, you should use Modified Communication Units (MC-units) for segmentation to account for syntactic differences between the two languages.

Key Rules:

  1. Basic Rule: An utterance is an independent clause with any of its modifiers, including subordinate clauses. 

    • Example:

      • English: “She was happy when she saw the dog.”

      • Spanish: “Ella estaba feliz cuando vio al perro.”

    • In both languages, this is one utterance because the subordinate clause cannot stand alone as a grammatical sentence.

 

  2. Special Rule for Spanish (Pro-Drop): In Spanish, the subject or subject pronoun can be dropped. The subject of the verb can usually be inferred because of person (1st, 2nd, 3rd) and number (singular, plural) agreement on the verb.

    • Example:

      • Spanish: “Él brincó y agarró la rana.”

      • If we drop the subject in the first clause: “__ Brincó y agarró la rana.”

 

When working with bilinguals and segmenting coordinate structures in either language, always count them as 2 MC-units.
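
The two key rules can be sketched in a few lines of Python. This is only an illustration, not part of SALT or of this protocol's tooling: it assumes the transcriber has already identified each clause by hand and labeled it as independent or subordinate, and the `Clause` class and `segment_mc_units` function are hypothetical names introduced here. Following the convention used in the transcripts on this page, a coordinating conjunction is placed at the start of the new utterance.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    """One hand-annotated clause (hypothetical structure for this sketch)."""
    text: str
    kind: str  # "independent" or "subordinate"
    conjunction: Optional[str] = None  # word linking this clause to the previous one


def segment_mc_units(clauses: List[Clause]) -> List[str]:
    """Group annotated clauses into MC-units.

    Rule 1: a subordinate clause stays with the independent clause it modifies.
    Rule 2: every independent clause starts a new utterance, even when it is
            joined to the previous one by a coordinating conjunction.
    """
    units: List[List[str]] = []
    for clause in clauses:
        if clause.kind == "independent" or not units:
            units.append([])  # start a new MC-unit
        if clause.conjunction:
            units[-1].append(clause.conjunction)
        units[-1].append(clause.text)
    return [" ".join(parts) for parts in units]


# Subordination: one MC-unit.
subordinated = [
    Clause("Ella estaba feliz", "independent"),
    Clause("vio al perro", "subordinate", conjunction="cuando"),
]

# Coordination: two MC-units.
coordinated = [
    Clause("Él brincó", "independent"),
    Clause("agarró la rana", "independent", conjunction="y"),
]

for sample in (subordinated, coordinated):
    for utterance in segment_mc_units(sample):
        print("*PAR:", utterance + ".")
```

The sketch deliberately leaves clause identification to the transcriber, since deciding whether a clause is independent or subordinate is the judgment this protocol is meant to guide.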

 

For monolingual English speakers, refer to the following file (under item d.) for the rules on C-unit segmentation, which differ from the modified C-units described above.

 

For added information, consult the SALT software guide: https://saltsoftware.com/media/wysiwyg/tranaids/CunitSummary.pdf

 

Additional Rules for Spanish Transcriptions

(Decisions made on 8/28/2024 and 9/2/2024)

Additional Rules for Utterance Segmentation:

 

1. Auxiliary Verb + Non-Finite Verb Form Rule (Gerund, Participle, Infinitive)

When the auxiliary verb is missing from the utterance but a non-finite verb form (such as a gerund, participle, or infinitive) is present, we can apply the omission criterion. This means we can segment the utterances as if the auxiliary verb were present, since it is implicit in the sentence.

Example:

*PAR: están de picnic una pareja merendando.
*PAR: él ø leyendo.
*PAR: y ella ø haciendo algo.

If the auxiliary verb were explicit:

*PAR: están de picnic una pareja merendando.
*PAR: él está leyendo.
*PAR: y ella está haciendo algo.

In this case, although the speaker did not say the auxiliary verb, we understand it to be implicit. It is therefore safe to segment the utterances as if the auxiliary verb were present.

 

2. Intonation and Pauses Rule

When segmenting utterances, it is essential to focus on syntax. However, if the language sample contains no verbs (as happens with some patients with non-fluent aphasia), we can be guided by the speaker's intonation and pauses.

Consider the following language sample: https://utexas.box.com/s/zwxcszwk3m1s32w30d0mgd7nvco469pe

 

Example:

*PAR: una bandera.
*PAR: un barco.
*PAR: están haciendo señas.
*PAR: un hombre que pesca.
*PAR: &+u un cubo y una pala.
*PAR: y un &+n niño mm jugando con las olas.

In the sample, there is a significant pause between "una bandera" and "un barco," which allows us to segment them as two separate utterances. By contrast, "un cubo y una pala" is not split because it was said continuously, which indicates that it should be treated as a single utterance.

This rule helps us segment correctly when syntax alone is not enough, by drawing on the cues provided by the speaker's intonation and pauses.
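
The pause part of this rule can be sketched as follows. This is a rough illustration under several assumptions, not a procedure used by this protocol: it assumes word-level timestamps are available, the `segment_by_pause` function and the 1.0-second `min_pause` threshold are hypothetical (the rule above only says "significant pause"), and the timestamps are invented for the example rather than measured from the linked sample. Intonation is not modeled at all.

```python
from typing import List, Optional, Tuple

# (token, onset in seconds, offset in seconds)
Word = Tuple[str, float, float]


def segment_by_pause(words: List[Word], min_pause: float = 1.0) -> List[str]:
    """Split a verbless stretch of speech wherever the silence between two
    consecutive words is at least `min_pause` seconds.

    Only meant for samples where syntax gives no guidance (no verbs).
    The threshold is an assumption of this sketch, not a protocol value.
    """
    utterances: List[str] = []
    current: List[str] = []
    previous_offset: Optional[float] = None
    for token, onset, offset in words:
        gap = 0.0 if previous_offset is None else onset - previous_offset
        if current and gap >= min_pause:
            utterances.append(" ".join(current))
            current = []
        current.append(token)
        previous_offset = offset
    if current:
        utterances.append(" ".join(current))
    return utterances


# Invented timestamps, for illustration only.
words = [
    ("una", 0.0, 0.3), ("bandera", 0.3, 0.9),
    ("un", 2.6, 2.8), ("barco", 2.8, 3.3),
    ("un", 5.0, 5.2), ("cubo", 5.2, 5.6), ("y", 5.7, 5.8),
    ("una", 5.8, 6.0), ("pala", 6.0, 6.4),
]

for utterance in segment_by_pause(words):
    print("*PAR:", utterance + ".")
```

With these made-up timings, the long silences separate "una bandera" from "un barco", while "un cubo y una pala" stays together because it was produced without a marked pause. In practice the transcriber's ear, including intonation, remains the deciding factor.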

 

3. Rule for Lists of Objects When There Are No Verbs and Pauses and/or Intonation Are Very Marked

When the speaker describes a list of objects without using verbs, and there are very marked pauses or very marked intonation, we can segment each object as an independent utterance. Consider this language sample: https://utexas.box.com/s/18rkfd8bzwtc8mzn5pln801dktvitp71

Example:

*PAR: un árbol con muchas hojas.
*PAR: un perro.
*PAR: un [/] un niño &+baya eh bañándose.

In this case, the speaker began with a list of objects without using any verb. Because there is a "change of topic" between objects, we can segment each one as a separate utterance. However, if the speaker had begun the description with a verb such as "ver" or "haber," we could keep the whole list as a single utterance instead of splitting it. For example:

*PAR: veo un árbol con muchas hojas, un perro, un [/] un niño &+baya eh bañándose.
*PAR: hay un árbol con muchas hojas, un perro, un [/] un niño &+baya eh bañándose.

 

4. Rule for the Phrase "O Sea"

 

5. Rule for the Word "Pero"

 

6. Rule for Abandoned Utterances vs. Revisions

 

7. Rule for Long Utterances

 

8. Rule for "que" or "uno de los cuales"

 

9. Rule for the Phrase "Por lo Tanto" or "Entonces"

 

10. Rule for the Word "Además"

 

SALT Reference Guide

Relevant sections:

Pg. 343 - Subordination Index

Pg. 352 - Tricky scoring examples

 

 

 
