Wednesday, August 14, 2019

Replication of the Keyword Extraction part of the paper "Without the Clutter of Unimportant Words": Descriptive Keyphrases for Text Visualization

The paper is on vixra at: http://vixra.org/abs/1908.0422
and on arxiv at: https://arxiv.org/abs/1908.07818


The dataset and code associated with the replication can be found at: web.eecs.umich.edu/~lahiri/replication_of_keyword_extraction_part_of_the_paper_by_Chuang_etal_data_and_code.zip (note that a long time has passed since the implementation, and we only have a minimal README at this moment.)

The dataset is for keyword extraction (keyphrase extraction) and is based on the SemEval 2010 Task 5 (Keyphrase Extraction) dataset: https://www.aclweb.org/anthology/S10-1004. We re-annotated the data (144 files) using Amazon Mechanical Turk (MTurk).

Keyword Extraction Dataset
Keyphrase Extraction Dataset
SemEval 2010 Dataset, re-annotated by Amazon Mechanical Turk. - MTurk

Monday, December 10, 2012

Word Sense and Subjectivity

Authors: Janyce Wiebe, Rada Mihalcea



Venue: ACL 2006



Research questions:

1> Can subjectivity labels be assigned to word senses? Yes.

2> Can automatic subjectivity analysis be used to improve word sense disambiguation? Yes.

3> Novelty: subjectivity annotation at the level of word senses, instead of words, sentences or clauses/phrases.



Methods:

For research question 1>

(a) Agreement between annotators assigning subjectivity labels ("subjective", "objective", "both", "uncertain") to word senses.

(b) Designing a "subjectivity score" for WordNet synsets.

For research question 2>

The output of a subjectivity sentence classifier is input to a word-sense disambiguation system, which is in turn evaluated on the nouns from the SENSEVAL-3 English lexical sample task.



Background:

1> Subjective expressions are of three types:

(a) references to private states

(b) references to speech (or writing) events expressing private states

(c) expressive subjective elements

2> Subjectivity analysis:

(a) identifying subjective words and phrases

(b) subjectivity classification of sentences, words or phrases/clauses in context

(c) applications: review classification, text mining for product reviews, summarization, information extraction, question answering



Inter-annotator agreement:

Judge 1 (a co-author) tagged 354 synsets (64 words). Judge 2 (not a co-author) tagged 138 synsets (32 words) independently.

Overall agreement was 85.5%, with a kappa value of 0.74. Authors tend to highlight high agreement and kappa values.
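A minimal sketch (my own, not from the paper) of how raw agreement and Cohen's kappa are computed, using scikit-learn; the two label lists are hypothetical stand-ins for the judges' tags on the synsets they both annotated.

# Sketch: raw agreement and Cohen's kappa for two judges.
# The label lists are hypothetical; in the paper each element would be one of
# "subjective", "objective", "both", "uncertain" for a commonly tagged synset.
from sklearn.metrics import cohen_kappa_score

judge1 = ["subjective", "objective", "objective", "both", "subjective"]
judge2 = ["subjective", "objective", "subjective", "both", "subjective"]

raw_agreement = sum(a == b for a, b in zip(judge1, judge2)) / len(judge1)
kappa = cohen_kappa_score(judge1, judge2)
print(f"raw agreement = {raw_agreement:.3f}, kappa = {kappa:.3f}")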

Causes of uncertainty:

(a) subjective senses are missing in the dictionary

(b) the hypernym may have subjective senses that interfere with judging the current synset



Subjectivity scoring:

1> Find distributionally similar words (DSW).

2> Determine similarity of a word-sense with each DSW. Let us call it "sim". So, for k word senses and p DSWs, we get a k-by-p matrix of "sim" scores.

3> Whenever a DSW appears in a subjective context (in MPQA corpus), we add its "sim" score to the subjectivity score, "subj". Whenever a DSW appears in a non-subjective context, we subtract its "sim" score from "subj".

Irrespective of subjective/non-subjective context, we also add the "sim" score to a running total, "totsubj".

4> Do step 3 for all DSWs.

5> Divide "subj" by "totsubj" to obtain the final subjectivity score for the word sense (a minimal sketch of these steps follows below).
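A minimal sketch of one reading of steps 1-5, assuming we already have, for each word sense, its "sim" scores to the DSWs and, for each DSW, counts of its subjective and non-subjective occurrences in the MPQA corpus. All names and numbers are illustrative, not the paper's.

# Sketch of the subjectivity score from steps 1-5 above (names are illustrative).
# sim_row[j]    : similarity of one word sense to distributionally similar word j
# subj_count[j] : occurrences of DSW j in subjective contexts (MPQA)
# obj_count[j]  : occurrences of DSW j in non-subjective contexts (MPQA)

def subjectivity_score(sim_row, subj_count, obj_count):
    subj, totsubj = 0.0, 0.0
    for j, sim in enumerate(sim_row):
        # step 3: add sim for each subjective occurrence, subtract for each non-subjective one
        subj += sim * subj_count[j] - sim * obj_count[j]
        # the normalizer accumulates sim for every occurrence, regardless of context
        totsubj += sim * (subj_count[j] + obj_count[j])
    # step 5: normalize
    return subj / totsubj if totsubj else 0.0

sim_row = [0.8, 0.5, 0.3]     # similarities of one sense to 3 DSWs (made up)
subj_count = [4, 1, 0]        # subjective-context occurrences (made up)
obj_count = [1, 2, 5]         # non-subjective-context occurrences (made up)
print(subjectivity_score(sim_row, subj_count, obj_count))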



Subjectivity scoring evaluation:

On 354 word senses.

For each sense, a subjectivity score is computed first. The score is then thresholded to obtain a "subjectivity label" (+1 or -1). Different thresholds were tried, trading off precision and recall (see the sketch below).

Informed random baseline - precision is fixed, and the maximum recall is one.

DSW choice - similarity_all, similarity_selected.

Criteria - precision, recall, break-even point (where precision == recall).
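A minimal sketch of the thresholding and break-even computation described above; the scores and gold labels are hypothetical placeholders, not the 354 annotated senses.

# Sketch: threshold the subjectivity scores to get labels (+1 / -1), then sweep
# thresholds and report precision/recall; the break-even point is where the two
# (approximately) coincide. "scores" and "gold" are hypothetical placeholders.
scores = [0.9, 0.7, 0.4, 0.2, 0.1]
gold   = [+1,  +1,  -1,  +1,  -1]

def precision_recall(threshold):
    pred = [+1 if s >= threshold else -1 for s in scores]
    tp = sum(p == g == +1 for p, g in zip(pred, gold))
    predicted_pos = sum(p == +1 for p in pred)
    actual_pos = sum(g == +1 for g in gold)
    precision = tp / predicted_pos if predicted_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    return precision, recall

for t in sorted(set(scores), reverse=True):
    p, r = precision_recall(t)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")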



Subjectivity for WSD:

Make an existing WSD system subjectivity-aware (by assigning subjectivity scores to sentences containing ambiguous words), and compare its performance with that of the original system.

Hypothesis: instances of subjective senses are more likely to be in subjective sentences.
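A minimal sketch (illustrative only) of one way to operationalize this: append a sentence-level subjectivity score to each instance's feature vector before training the WSD classifier. The feature extractor, subjectivity scorer and data below are toy stand-ins, not the systems or data used in the paper.

# Sketch: a subjectivity-aware feature vector for a feature-based WSD classifier.
from sklearn.naive_bayes import GaussianNB

SUBJECTIVE_CUES = {"love", "hate", "wonderful", "terrible"}

def context_features(sentence):
    # toy stand-in for the base WSD feature extractor (context-word indicators)
    words = set(sentence.lower().split())
    return [float(cue in words) for cue in ("bank", "rate", "keen", "art")]

def sentence_subjectivity(sentence):
    # toy stand-in for the output of a subjectivity sentence classifier
    words = set(sentence.lower().split())
    return len(words & SUBJECTIVE_CUES) / max(len(words), 1)

def subjectivity_aware_features(sentence):
    return context_features(sentence) + [sentence_subjectivity(sentence)]

# two toy training instances for the ambiguous word "interest"
# (0 = financial sense, 1 = attention/curiosity sense)
train = [("the bank raised the interest rate", 0),
         ("i love her keen interest in art", 1)]
X = [subjectivity_aware_features(s) for s, _ in train]
y = [label for _, label in train]
clf = GaussianNB().fit(X, y)
print(clf.predict([subjectivity_aware_features("what a wonderful interest in art")]))

Compare accuracy against the same classifier trained without the extra feature to see whether the subjectivity signal helps.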



Data/corpus:

MPQA

SENSEVAL-3



Open research questions:

1> The converse of research question 2: can word sense disambiguation help automatic subjectivity analysis?

2> Assign subjectivity labels to WordNet entries, thereby enabling subjectivity-aware word search and traversal along "subjectivity trails".



Cited papers that might be relevant to my research:

1> D. McCarthy, R. Koeling, J. Weeds, and J. Carroll. 2004. Finding predominant senses in untagged text. In Proc. ACL 2004.

2> D. Lin. 1998. Automatic retrieval and clustering of similar words. In Proceedings of COLING-ACL, Montreal, Canada.

3> J. Jiang and D. Conrath. 1997. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the International Conference on Research in Computational Linguistics, Taiwan.

4> J. Wiebe. 2000. Learning subjective adjectives from corpora. In Proc. AAAI 2000.

5> S.-M. Kim and E. Hovy. 2004. Determining the sentiment of opinions. In Proc. COLING 2004.

6> J. Kamps and M. Marx. 2002. Words with attitude.

7> M. Hu and B. Liu. 2004. Mining and summarizing customer reviews. In Proc. KDD 2004.

8> H. Yu and V. Hatzivassiloglou. 2003. Towards answering opinion questions: separating facts from opinions and identifying the polarity of opinion sentences. In Proc. EMNLP 2003.

9> A. Esuli and F. Sebastiani. 2005. Determining the semantic orientation of terms through gloss analysis. In Proc. CIKM 2005.



Citing papers that might be relevant to my research:

-- Title of the paper
-- Authors
-- Topic



Bibtex entry:

@inproceedings{Wiebe:2006:WSS:1220175.1220309,
author = {Wiebe, Janyce and Mihalcea, Rada},
title = {Word sense and subjectivity},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics},
series = {ACL-44},
year = {2006},
location = {Sydney, Australia},
pages = {1065--1072},
numpages = {8},
url = {http://dx.doi.org/10.3115/1220175.1220309},
doi = {http://dx.doi.org/10.3115/1220175.1220309},
acmid = {1220309},
publisher = {Association for Computational Linguistics},
address = {Stroudsburg, PA, USA},
}

To read

1> other papers on formality (e.g., lexical formality), machine learning, Risi Kondor, Leskovec, topic models, CRFs, Markov random fields, machine reading, Markov logic networks, time series, Eamonn Keogh, topic detection and tracking



2> Recursive descent from different papers



3> Recursive ascent from different papers



4> acl anthology



5> conferences to read - ICML, NIPS, COLT, ALT, SODA, STOC, FOCS, ICDM, ICDE, ECML-PKDD, KDD, WWW, SIGIR, WSDM, CIKM, ACL, COLING, EMNLP, IJCNLP, HCOMP, SDM, SIGMOD, AAAI, IJCAI, UAI, ICANN, NAACL, JCDL, AISTATS



6> journals - Machine Learning, JMLR, Neural Computation, Computational Linguistics, Data Mining and Knowledge Discovery, Information Retrieval Journal, International Journal of Digital Libraries, Artificial Intelligence, Journal of Discrete Algorithms, Journal of Natural Language Engineering, Journal of Natural Language Processing



7> google "corpus linguistics conference" and "corpus linguistics journal"



8> google and gscholar "summarization", "document summarization", "multi-document summarization", "opinion mining", "sentiment analysis", "subjectivity", "readability", "discourse", "recommender systems", "recommendation systems", "reputation networks", "trust networks", "opinion evolution", "sentiment evolution", "opinion tracking", "sentiment tracking"

A Book: Opinion Mining and Sentiment Analysis

1> Query classification was the subject of KDD Cup 2005. The TREC 2006 Blog track dealt with whether texts were opinionated or not, and if they were, which portions were opinionated.

2> Section 1.4 is important for finding further papers and related work.

3> "Opinion mining" and "Sentiment analysis" are comparable (though not fully equivalent) terms; while the former was more popular with WWW community, the latter was more popular with ACL (NLP) community.

4> Similarity with spam detection. Humans are not particularly adept at tagging sentiments/opinions correctly.

5> Binary (presence) feature vectors have been found to be more effective in opinion mining than numerical (e.g., tf-idf type) feature vectors (see the sketch after this list).

6> Markedly different from topic modeling or IR. Hapax legomena are extremely important here, while repeated occurrence of a word (or a group of words) is not.

7> Topic-based summarization vs opinion-based summarization: In the former case, the first few sentences of a document are generally best summarizers. In the latter case, the last few sentences of a document were found to be the best summarizers.

8> One study has found that there is a real economic effect to be observed when factoring in reviewer credibility: Gu et al. [114] note that a weighted average of message-board postings in which poster credibility is factored in has “prediction power over future abnormal returns of the stock”, but if postings are weighted uniformly, the predictive power disappears.

Note the similar finding in Sumit Bhatia's work on online threads. There are information seekers (equivalent to low quality reviewers) and information providers (equivalent to high quality reviewers).
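A minimal sketch of the contrast in point 5> above: binary (presence) features versus tf-idf features over the same toy reviews, using scikit-learn vectorizers. The documents and setup are illustrative only.

# Sketch: binary (presence) features vs. tf-idf features for two toy reviews.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["great great plot, great acting",
        "dull plot, poor acting"]

binary_X = CountVectorizer(binary=True).fit_transform(docs)   # 0/1 presence
tfidf_X = TfidfVectorizer().fit_transform(docs)               # weighted frequencies

print(binary_X.toarray())
print(tfidf_X.toarray().round(2))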



Triggered research questions:

1> Can reinforcement learning be used for opinion mining/sentiment analysis? Are there algorithms of this kind? What I allude to is that as more and more data comes in, it might be possible to "refine" our earlier opinions in some way - which is a traditional reinforcement learning tactic.

Note that it is also much like a simulated annealing/genetic algorithm/evolutionary computation type thing. Are there algorithms that employ these to solve opinion mining/sentiment analysis problems? Search in the book.

Also note "prior polarity" and "contextual polarity" in 320.

2> Are there online/streaming algorithms for opinion mining?

3> HCI-type evaluation of the graphical summaries ("graphical summary interface")

4> Reviewing the reviews: review quality assessment - can formality be used/introduced as a user-review-quality-independent measure? Read 19, 99, 161, 329, 106, 193, 262, 161 for previous work.

We can also pose it as an exploratory analysis paper: "Formality of product reviews" or "Information content of product reviews". Has there been prior work along these lines? Check. 106, 107, 161 and 329 are already worth looking at. Also check section 5.2.4.2 (and all its cited papers) for the research question "formality and reviewer-credibility".

Note that our "implicature score" may also come in handy for measuring review quality.

5> A new feature - "bag of POS". Check prior work, if any.

6> How about jointly plotting and/or modeling the temporal trend of sales and opinions? Any prior work?



Reported corpora:

Blog06, BlogPulse, Congressional floor debate transcripts, Cornell movie review datasets, customer review datasets, Economining, French sentences, MPQA corpus, multiple-aspect restaurant reviews, multi-domain sentiment dataset, NTCIR multilingual corpus, review-search result sets, OpQA corpus



Reported lexica:

General Inquirer, OpinionFinder’s Subjectivity Lexicon, SentiWordNet, Taboada and Grieve’s Turney adjective list



More labels/tags for this post:

winner circle bias, sock puppet, sock puppetry, hedonic regression



Bibtex entry:

@article{Pang+Lee:08b,
author = {Bo Pang and Lillian Lee},
title = {Opinion mining and sentiment analysis},
journal = {Foundations and Trends in Information Retrieval},
year = {2008},
volume = {2},
number = {1-2},
pages = {1--135}
}



Interesting papers for further reading:

1> Learning to laugh (automatically): Computational models for humor recognition

2> Word Sense and Subjectivity by Wiebe and Mihalcea

3> Learning Subjective Language by Wiebe, et al

4> Thumbs up? Sentiment Classification using Machine Learning Techniques

5> A. Anagnostopoulos, A. Z. Broder, and D. Carmel, “Sampling search-engine results,” World Wide Web, vol. 9, pp. 397–429, 2006.

6> Feature engineering for text classification

7a> Isotonic Conditional Random Fields and Local Sentiment Flow

7b> Generalized Isotonic Conditional Random Fields

8> Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 347–354, 2005.

9> Theresa Wilson, Janyce Wiebe, and Rebecca Hwa. Just how mad are you? Finding strong and weak opinion clauses. In Proceedings of AAAI, pages 761–769, 2004. Extended version in Computational Intelligence 22(2, Special Issue on Sentiment Analysis):73–99, 2006.

10> Fabrizio Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1–47, 2002.

11> Vasileios Hatzivassiloglou and Janyce Wiebe. Effects of adjective orientation and gradability on sentence subjectivity. In Proceedings of the International Conference on Computational Linguistics (COLING), 2000.

12> Peter Turney. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the Association for Computational Linguistics (ACL), pages 417–424, 2002.

13> Hong Yu and Vasileios Hatzivassiloglou. Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2003.

14> Ann Devitt and Khurshid Ahmad. Sentiment analysis in financial news: A cohesion-based approach. In Proceedings of the Association for Computational Linguistics (ACL), pages 984–991, 2007.

15> Koji Eguchi and Victor Lavrenko. Sentiment retrieval using generative models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 345–354, 2006.

16> Wei-Hao Lin and Alexander Hauptmann. Are these documents written from different perspectives? A test of different perspectives based on statistical distribution divergence. In Proceedings of the International Conference on Computational Linguistics (COLING)/Proceedings of the Association for Computational Linguistics (ACL), pages 1057–1064, Sydney, Australia, July 2006. Association for Computational Linguistics.

17> Claire Cardie, Janyce Wiebe, Theresa Wilson, and Diane Litman. Combining low-level and summary representations of opinions for multi-perspective question answering. In Proceedings of the AAAI Spring Symposium on New Directions in Question Answering, pages 20–27, 2003.

18> Giuseppe Carenini, Raymond Ng, and Adam Pauls. Multi-document summarization of evaluative text. In Proceedings of the European Chapter of the Association for Computational Linguistics (EACL), pages 305–312, 2006.

19> Claire Cardie. Empirical methods in information extraction. AI Magazine, 18(4):65–79, 1997.

20> Luca Dini and Giampaolo Mazzini. Opinion classification through information extraction. In Proceedings of the Conference on Data Mining Methods and Databases for Engineering, Finance and Other Fields (Data Mining), pages 299–310, 2002.

21> Judee K. Burgoon, J. P. Blair, Tiantian Qin, and Jay F. Nunamaker, Jr. Detecting deception through linguistic analysis. In Proceedings of Intelligence and Security Informatics (ISI), number 2665 in Lecture Notes in Computer Science, page 958, 2008.

22> Veselin Stoyanov, Claire Cardie, Diane Litman, and Janyce Wiebe. Evaluating an opinion annotation scheme using a new multi-perspective question and answer corpus. In Qu et al. [245]. AAAI technical report SS-04-07.

23> Janyce Wiebe, Theresa Wilson, and Claire Cardie. Annotating expressions of opinions and emotions in language. Language Resources and Evaluation (formerly Computers and the Humanities), 39(2/3): 164–210, 2005.

24> Janyce M. Wiebe, Rebecca F. Bruce, and Thomas P. O’Hara. Development and use of a gold standard data set for subjectivity classifications. In Proceedings of the Association for Computational Linguistics (ACL), pages 246–253, 1999.

25a> Nitin Jindal and Bing Liu. Identifying comparative sentences in text documents. In Proceedings of the ACM Special Interest Group on Information Retrieval (SIGIR), 2006. [This is the longer and more comprehensive version.]

25b> Nitin Jindal and Bing Liu. Mining comparative sentences and relations. In Proceedings of AAAI, 2006.

26> Eric Breck and Claire Cardie. Playing the telephone game: Determining the hierarchical structure of perspective and speech expressions. In Proceedings of the International Conference on Computational Linguistics (COLING), 2004.

27> Jeonghee Yi, Tetsuya Nasukawa, Razvan Bunescu, and Wayne Niblack. Sentiment analyzer: Extracting sentiments about a given topic using natural language processing techniques. In Proceedings of the IEEE International Conference on Data Mining (ICDM), 2003.

28> Minqing Hu and Bing Liu. Mining opinion features in customer reviews. In Proceedings of AAAI, pages 755–760, 2004.

29> Christian Jacquemin. Spotting and Discovering Terms through Natural Language Processing. MIT Press, 2001.

30> Rayid Ghani, Katharina Probst, Yan Liu, Marko Krema, and Andrew Fano. Text mining for product attribute extraction. SIGKDD Explorations Newsletter, 8(1):41–48, 2006.

31> Ana-Maria Popescu and Oren Etzioni. Extracting product features and opinions from reviews. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), 2005.

32> Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, and Toshikazu Fukushima. Mining product reputations on the web. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 341–349, 2002. Industry track.

33> Tony Mullen and Robert Malouf. A preliminary investigation into sentiment analysis of informal political discourse. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAICAAW), pages 159–162, 2006.

34> Tony Mullen and Robert Malouf. Taking sides: User classification for informal online political discourse. Internet Research, 18:177–190, 2008.

35> Yejin Choi, Claire Cardie, Ellen Riloff, and Siddharth Patwardhan. Identifying sources of opinions with conditional random fields and extraction patterns. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), 2005.

36> Steven Bethard, Hong Yu, Ashley Thornton, Vasileios Hatzivassiloglou, and Dan Jurafsky. Automatic extraction of opinion propositions and their holders. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text, 2004.

37> Yejin Choi, Eric Breck, and Claire Cardie. Joint extraction of entities and relations for opinion recognition. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2006.

38> Dan Roth and Wen Yih. Probabilistic reasoning for entity and relation recognition. In Proceedings of the International Conference on Computational Linguistics (COLING), 2004.

39> Soo-Min Kim and Eduard Hovy. Identifying and analyzing judgment opinions. In Proceedings of the Joint Human Language Technology/North American Chapter of the ACL Conference (HLT-NAACL), 2006.

40> Veselin Stoyanov and Claire Cardie. Partially supervised coreference resolution for opinion summarization through structured rule learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 336–344, Sydney, Australia, July 2006. Association for Computational Linguistics.

41> Bo Pang and Lillian Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the Association for Computational Linguistics (ACL), pages 271–278, 2004.

42> Shaikh Mostafa Al Masum, Helmut Prendinger, and Mitsuru Ishizuka. SenseNet: A linguistic tool to visualize numerical-valence based sentiment of textual data. In Proceedings of the International Conference on Natural Language Processing (ICON), pages 147–152, 2007. Poster.

43a> Michael White, Claire Cardie, and Vincent Ng. Detecting discrepancies in numeric estimates using multidocument hypertext summaries. In Proceedings of the Conference on Human Language Technology, pages 336–341, 2002.

43b> Michael White, Claire Cardie, Vincent Ng, Kiri Wagstaff, and Daryl McCullough. Detecting discrepancies and improving intelligibility: Two preliminary evaluations of RIPTIDES. In Proceedings of the Document Understanding Conference (DUC), 2001.

44> Ehud Reiter and Robert Dale. Building Natural Language Generation Systems. Cambridge, 2000.

45> Lun-Wei Ku, Yu-Ting Liang, and Hsin-Hsi Chen. Opinion extraction, summarization and tracking in news and blog corpora. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), pages 100–107, 2006.

46> All LREC papers

47> Yukiko Kawai, Tadahiko Kumamoto, and Katsumi Tanaka. Fair News Reader: Recommending news articles with different sentiments based on user preference. In Proceedings of Knowledge-Based Intelligent Information and Engineering Systems (KES), number 4692 in Lecture Notes in Computer Science, pages 612–622, 2007.

48> Xiaodan Song, Yun Chi, Koji Hino, and Belle Tseng. Identifying opinion leaders in the blogosphere. In Proceedings of the ACM SIGIR Conference on Information and Knowledge Management (CIKM), pages 971–974, 2007.

49> Meishan Hu, Aixin Sun, and Ee-Peng Lim. Comments-oriented blog summarization by sentence extraction. In Proceedings of the ACM SIGIR Conference on Information and Knowledge Management (CIKM), pages 901–904, 2007. ISBN 978-1-59593-803-9. Poster paper.

50> Stephen Wan and Kathy McKeown. Generating overview summaries of ongoing email thread discussions. In Proceedings of the International Conference on Computational Linguistics (COLING), pages 549–555, Geneva, Switzerland, 2004.

51> Liang Zhou and Eduard Hovy. On the summarization of dynamically introduced information: Online discussions and blogs. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), pages 237–242, 2006.

52> Minqing Hu and Bing Liu. Mining and summarizing customer reviews. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 168–177, 2004.

53> Li Zhuang, Feng Jing, Xiao-yan Zhu, and Lei Zhang. Movie review mining and summarization. In Proceedings of the ACM SIGIR Conference on Information and Knowledge Management (CIKM), 2006.

54> Nan Hu, Paul A. Pavlou, and Jennifer Zhang. Can online reviews reveal a product’s true quality?: empirical findings and analytical modeling of online word-of-mouth communication. In Proceedings of Electronic Commerce (EC), pages 324–330, New York, NY, USA, 2006. ACM.

55> Luís Cabral and Ali Hortaçsu. The dynamics of seller reputation: Theory and evidence from eBay. Working paper, downloaded version revised in March, 2006. URL http://pages.stern.nyu.edu/~lcabral/workingpapers/CabralHortacsu_Mar06.pdf.

56> Pero Subasic and Alison Huettner. Affect analysis of text using fuzzy semantic typing. IEEE Transactions on Fuzzy Systems, 9(4):483–496, 2001.

57> James Allan. Introduction to topic detection and tracking. In James Allan, editor, Topic detection and tracking: Event-based information organization, pages 1–16, Norwell, MA, USA, 2002. Kluwer Academic Publishers. ISBN 0-7923-7664-1.

58> Gilad Mishne and Maarten de Rijke. Moodviews: Tools for blog mood analysis. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), pages 153–154, 2006.

59> Baruch Awerbuch and Robert Kleinberg. Competitive collaborative learning. In Proceedings of the Conference on Learning Theory (COLT), pages 233–248, 2005. Journal version to appear in Journal of Computer and System Sciences, special issue on computational learning theory.

60> Benjamin Snyder and Regina Barzilay. Multiple aspect ranking using the Good Grief algorithm. In Proceedings of the Joint Human Language Technology/North American Chapter of the ACL Conference (HLT-NAACL), pages 300–307, 2007.

61> Andrea Esuli and Fabrizio Sebastiani. Determining term subjectivity and term orientation for opinion mining. In Proceedings of the European Chapter of the Association for Computational Linguistics (EACL), 2006.

Title of the paper

Authors:



Venue:



Research question:



Data/corpus:



Algorithms/methods:



Findings/conclusions:

-- What experiments were performed? Why?
-- What metrics were used/reported? Why were those metrics chosen?
-- How did the metrics vary/change over time? How did they vary across different algorithms/methods? How did they vary across datasets?
-- How does that support/not support/refute the research claims of the authors?



Open research questions:



Related work:

-- Title of the paper
-- Authors
-- Topic



Cited papers that might be relevant to my research:

-- Title of the paper
-- Authors
-- Topic



Citing papers that might be relevant to my research:

-- Title of the paper
-- Authors
-- Topic



Bibtex entry:



More labels/tags for this post:

Formality of language: definition, measurement, behavioral determinants

Authors: Francis Heylighen, Jean-Marc Dewaele



Research question: quantifying formality



Data/corpus:

1> Two speech styles and one written style: (a) informal conversation among students, (b) oral examination, (c) essay produced in a written test

2> Frequency dictionaries of Italian and Dutch (for measuring frequencies of deictic and non-deictic words)

3> French interlanguage data



Findings/conclusions:

1> The most fundamental purpose of language production is communication: making oneself understood by someone else.

2> Surface formality - formality for its own sake

3> Deep formality - formality to express meaning more clearly and completely

4> There is a close parallel between natural languages and artificial (e.g., programming) languages. The reason programming languages are known as "formal languages" is that they all show a VERY high degree of (deep) formality. So, natural languages, the authors posit, will also show programming-language-type formality, if formalized ad nauseam.

[Note: A very similar idea is Lotfi Zadeh's PNL (precisiated natural language). "Formalization" is equivalent to Zadeh's "precisiation".]

5> Fuzziness - situation where the reference of an expression is not unambiguously determined (e.g., "It is hot" (how hot?) or "I am in love" (love or infatuation or fling?))

[Note: See Fuzzy Logic and Fuzzy Set Theory]

6> Expressions can be both fuzzy and context-dependent (a "tall" building in NYC is not the same as a "tall" building in State College). In fact, it is difficult to clearly separate fuzziness and context-dependence.

7> Formal styles tend to avoid not only context-dependent expressions, but also fuzzy ones. In practice, formal speakers will tend to choose the least fuzzy expressions that can be applied without too much effort. But since the information necessary to resolve fuzziness is by definition not completely under the control of the communicator, while the information specifying the context is, we should expect much more variation between formal and informal styles on the level of contextuality than on the level of fuzziness.

8> Spectrum of fuzziness and context-dependence. High fuzziness, low context-dependence - politician's speech. Low fuzziness, high context-dependence - poetry.

Variation along the expressivity axis is less natural in the sense that it will always to some degree flout Grice's (1975) maxims of informativeness and avoidance of ambiguity, in the case of poetry in order to create unique artistic effects, in the case of the politician beating around the bush in order to simply avoid communication.

9> More formal messages have less chance to be misinterpreted by others who do not share the same context as the sender. This is clearly exemplified by written language, where there is no direct contact between sender and receiver, and hence a much smaller sharing of context than in speech. We should thus expect written language in general to be more formal than spoken language. The definition also implies that validity or comprehensibility of formal messages will extend over wider contexts: more people, longer time spans, more diverse circumstances, etc.

10> Formality is rigid; meanings don't shift. Informality is flexible; meanings can shift over time, place, person or discourse.

11> Formal style is detached, impersonal, less direct and more objective. Informal style is interactive and more involved.

12> Time ("now", "then"), place ("here", "there"), person ("he", "she") and discourse deixis ("yes", "no", "notwithstanding", "therefore", "however"). Other examples of discourse deixis are anaphora and interjections.

13> Nouns, adjectives, articles and prepositions are non-deictic. Pronouns, adverbs, verbs and interjections are deictic. Conjunctions are deixis-neutral.

14> F = (noun freq. + adjective freq. + preposition freq. + article freq. − pronoun freq. − verb freq. − adverb freq. − interjection freq. + 100) / 2

The frequencies are actually percentages of the number of words belonging to each category (a small computation sketch follows after this list).

15> PCA on word frequencies (actually proportions) yielded formality as the most important dimension of writing style variation.
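A small computation sketch for the F measure in 14>, computed from POS-category counts; the example counts are made up, and in practice they would come from a POS tagger.

# Sketch: Heylighen & Dewaele's formality measure F from POS-category counts.
def formality_score(pos_counts):
    total = sum(pos_counts.values())
    pct = {pos: 100.0 * count / total for pos, count in pos_counts.items()}
    non_deictic = (pct.get("noun", 0) + pct.get("adjective", 0)
                   + pct.get("preposition", 0) + pct.get("article", 0))
    deictic = (pct.get("pronoun", 0) + pct.get("verb", 0)
               + pct.get("adverb", 0) + pct.get("interjection", 0))
    return (non_deictic - deictic + 100) / 2

example = {"noun": 30, "adjective": 10, "preposition": 12, "article": 8,
           "pronoun": 10, "verb": 18, "adverb": 7, "interjection": 1,
           "conjunction": 4}
print(formality_score(example))   # 62.0 for this made-up distribution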



Psychology-related findings:

1> Formality depends on the situation. Formality will be highest in those situations where accurate understanding is essential, such as contracts, laws, or international treaties. Second, formality will be higher when correct interpretation is more difficult to achieve. This is the case with feedback-less conversations. For example, phone conversations are more formal than face-to-face conversations, and mails are less formal than books or articles.

2> Expression (E) + Context (C) -> Interpretation (I)

The more context C is shared, the less formal E needs to be; the less context is shared, the more formal E must be.

For example, the larger the difference (or distance) between two communicators (in terms of psychology, culture, age, class, social rank, nationality or education), the more formal their communication will be.

People who are psychologically close, such as siblings, spouses or intimate friends, will tend to be minimally formal in their exchanges. We would venture that the highest degree of informality will be found among identical twins that were raised together, who completely share their cultural, social and even biological backgrounds.

3> Audience size. All other things being equal, the larger the audience, the less the different receivers and the sender will have in common, and thus the smaller the shared context. => Higher formality.

Moreover, the larger the audience, in general, the more important it will be to secure accurate understanding. Therefore, we may expect that speeches or texts directed to a large audience will be more formal than comments addressed to one or a few persons.

4> The longer the time span between sending and receiving, the less will remain of the original context in which the expression was produced. For example, reports written for archiving purposes will be more formal than notes taken to remember tomorrow’s agenda. This may also in part explain why spontaneous speeches, produced on the spot, have a much lower formality than speeches prepared at an earlier moment. Another way to test this proposition empirically might consist in measuring the formality of messages sent through fast media (e.g. fax or electronic mail) versus slow media (e.g. postal mail). A message that can be expected to reach the addressee the same day should on average be less formal than a message that takes several days to get through.

5> Finally, the factor of discourse deixis suggests that formality would be higher at the beginning of a conversation or text, because there is not any previous discourse to refer to as yet. Testing this hypothesis is straightforward: it suffices to collect a range of opening sentences or opening paragraphs from articles, speeches or conversations and compare their average formality with the formality of sentences from the middle of the same language sample.

6> Gender. Women’s speech is more formal in the "surface" sense, but less formal in the "deep" sense. It appears that women tend in general to be more intimate or involved in conversations ("rapport talk"), whereas men remain more distant or detached towards their conversation partners ("report talk").

7> Introversion. Introverts use higher (deep) formality, extroverts use lower (deep) formality.

8> Level of education. Highly educated people use more deeply formal language than less highly educated people.



Open research questions:

1> Is evoked contextuality (deictic and anaphoric) a good measure of overall contextuality, and thus of formality?

2> Instead of PCA, how about doing an LDA on word frequencies? Or maybe a manifold regularization (or kernel method)? Will that impart additional insight into formality (or any other hitherto unknown dimension of style variation)?

3> Extension of the formality F-score. As we know, the F-score works well across different languages. Does it work for Bengali as well? We can do a PCA on Bengali word (POS) frequencies and see whether nouns, adjectives and prepositions get positive loadings while verbs, adverbs and interjections get negative loadings (a sketch follows below).
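A minimal sketch of that check: run PCA on per-document POS-category proportions and inspect the signs of the loadings on the first component. The data here is random placeholder data, so the signs are meaningless; with real Bengali data one would look for the predicted sign pattern.

# Sketch: PCA on per-document POS proportions; inspect first-component loadings.
import numpy as np
from sklearn.decomposition import PCA

pos_categories = ["noun", "adjective", "preposition", "article",
                  "pronoun", "verb", "adverb", "interjection"]

rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(len(pos_categories)), size=50)  # 50 documents' POS proportions

pca = PCA(n_components=2).fit(X)
for pos, loading in zip(pos_categories, pca.components_[0]):
    print(f"{pos:>13s}: {loading:+.3f}")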



Related work:

1> Hasan, R. (1984) Ways of saying: ways of meaning. in: R. P. Fawcett, M.A.K. Halliday, S.M. Lamb, A. Makkai (eds.), The semiotics of Culture and Language. Vol. 1 Language as Social Semiotic (pp. 105-162) London & Dover: Pinter.

Explicit and implicit styles

2> Lexical density



Other related work:

1> Klir, G. & Folger, T. (1987) Fuzzy Sets, Uncertainty, and Information. Prentice Hall, Englewood Cliffs, NJ.

2> van Brakel, J. (1992) The Complete Description of the Frame Problem, Psycoloquy 3 (60) frameproblem 2.

3> Givón, T. Function, structure and language acquisition, in: The crosslinguistic study of language acquisition: Vol. 1, D.I. Slobin (ed.), Hillsdale, Lawrence Erlbaum, 1008-1025.

4> Leckie-Tarry, H. (1995) Language and context. A functional linguistic theory of register. (edited by David Birch), London-New York: Pinter.

5> Halliday, M.A.K. (1985) Spoken and written language. Oxford: Oxford University Press.

6> Uit Den Boogaert, P.C. (1975) Woordfrekwenties. In geschreven en gesproken Nederlands. Oosthoek, Scheltema & Holkema, Utrecht.

7> Besnier, N. (1988) The Linguistic Relationships of Spoken and Written Nukulaelae. Language 64, 707-736.

8> Biber, D. & Hared, M. (1992) Dimensions of Register Variation in Somali. Language Variation and Change, 4, 41-75.



More labels/tags for this post: context-dependent, context, anaphora, formality continuum, expressivity, observer's paradox, linguistic complementarity principle, HyperTalk, nominalization, verbalization



Bibtex entry:

@TECHREPORT{Heylighen99formalityof,
author = {Francis Heylighen and Jean-Marc Dewaele},
title = {Formality of Language: definition, measurement and behavioral determinants},
institution = {},
year = {1999}
}