Publications

Though preceding work in computational argument quality (AQ) mostly focuses on assessing overall AQ, researchers agree that writers would benefit from feedback targeting individual dimensions of argumentation theory. However, a large-scale theory-based corpus and corresponding computational models are missing. We fill this gap by conducting an extensive analysis covering three diverse domains of online argumentative writing and presenting GAQCorpus: the first large-scale English multi-domain (community Q&A forums, debate forums, review forums) corpus annotated with theory-based AQ scores. We then propose the first computational approaches to theory-based assessment, which can serve as strong baselines for future work. We demonstrate the feasibility of large-scale AQ annotation, show that exploiting relations between dimensions yields performance improvements, and explore the synergies between theory-based prediction and practical AQ assessment.
@inproceedings{lauscher-etal-2020-rhetoric,
    title = "Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing",
    author = "Lauscher, Anne  and
      Ng, Lily  and
      Napoles, Courtney  and
      Tetreault, Joel",
    booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (Online)",
    publisher = "International Committee on Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.coling-main.402",
    doi = "10.18653/v1/2020.coling-main.402",
    pages = "4563--4574"}
Computational models of argument quality (AQ) have focused primarily on assessing the overall quality or just one specific characteristic of an argument, such as its convincingness or its clarity. However, previous work has claimed that assessment based on theoretical dimensions of argumentation could benefit writers, but developing such models has been limited by the lack of annotated data. In this work, we describe GAQCorpus, the first large, domain-diverse annotated corpus of theory-based AQ. We discuss how we designed the annotation task to reliably collect a large number of judgments with crowdsourcing, formulating theory-based guidelines that helped make subjective judgments of AQ more objective. We demonstrate how to identify arguments and adapt the annotation task for three diverse domains. Our work will inform research on theory-based argumentation annotation and enable the creation of more diverse corpora to support computational AQ assessment.
@inproceedings{ng-etal-2020-creating,
    title = "Creating a Domain-diverse Corpus for Theory-based Argument Quality Assessment",
    author = "Ng, Lily  and
      Lauscher, Anne  and
      Tetreault, Joel  and
      Napoles, Courtney",
    booktitle = "Proceedings of the 7th Workshop on Argument Mining",
    month = dec,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.argmining-1.13",
    pages = "117--126"}
    
Until now, grammatical error correction (GEC) has been primarily evaluated on text written by non-native English speakers, with a focus on student essays. This paper enables GEC development on text written by native speakers by providing a new data set and metric. We present a multiple-reference test corpus for GEC that includes 4,000 sentences in two new domains (formal and informal writing by native English speakers) and 2,000 sentences from a diverse set of non-native student writing. We also collect human judgments of several GEC systems on this new test set and perform a meta-evaluation, assessing how reliable automatic metrics are across these domains. We find that commonly used GEC metrics have inconsistent performance across domains, and therefore we propose a new ensemble metric that is robust on all three domains of text.
@article{napoles-EtAl:2019:TACL,
    author  = {Napoles, Courtney and Nadejde, Maria and Tetreault, Joel},
    title   = {Enabling Robust Grammatical Error Correction in New Domains: Data Sets, Metrics, and Analyses},
    journal = {Transactions of the Association for Computational Linguistics},
    volume  = {7},
    pages   = {551--566},
    year    = {2019},
    doi     = {10.1162/tacl_a_00282},
    URL     = {https://doi.org/10.1162/tacl_a_00282},
}
Run-on sentences are common grammatical mistakes, but little research has tackled this problem to date. This work introduces two machine learning models to correct run-on sentences that outperform leading methods for related tasks, punctuation restoration and whole-sentence grammatical error correction. Due to the limited annotated data for this error, we experiment with artificially generating training data from clean newswire text. Our findings suggest artificial training data is viable for this task. We discuss implications for correcting run-ons and other types of mistakes that have low coverage in error-annotated corpora.
@InProceedings{zheng-napoles-tetreault:2018:W-NUT2018,
    author    = {Zheng, Junchao  and  Napoles, Courtney  and  Tetreault, Joel},
    title     = {How do you correct run-on sentences it's not as easy as it seems},
    booktitle = {Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text},
    month     = {November},
    year      = {2018},
    address   = {Brussels, Belgium},
    publisher = {Association for Computational Linguistics},
    pages     = {33--38},
    url       = {http://www.aclweb.org/anthology/W18-6105}
}
In this work we adapt machine translation (MT) to grammatical error correction, identifying how components of the statistical MT pipeline can be modified for this task and analyzing how each modification impacts system performance. We evaluate the contribution of each of these components with standard evaluation metrics and automatically characterize the morphological and lexical transformations made in system output. Our model rivals the current state of the art using a fraction of the training data.
@InProceedings{napoles-callisonburch:2017:BEA,
    author    = {Napoles, Courtney  and  Callison-Burch, Chris},
    title     = {Systematically Adapting Machine Translation for Grammatical Error Correction},
    booktitle = {Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications},
    month     = {September},
    year      = {2017},
    address   = {Copenhagen, Denmark},
    publisher = {Association for Computational Linguistics},
    pages     = {345--356},
    url       = {http://www.aclweb.org/anthology/W17-5039}
}
The field of grammatical error correction (GEC) has made tremendous strides in the last ten years, but new questions and obstacles are revealing themselves. In this position paper, we discuss the issues that need to be addressed, provide recommendations for the field to continue making progress, and propose a new shared task. We invite suggestions and critiques from the audience to make the new shared task a community-driven venture.
@InProceedings{sakaguchi-napoles-tetreault:2017:BEA,
    author    = {Sakaguchi, Keisuke  and  Napoles, Courtney  and  Tetreault, Joel},
    title     = {GEC into the future: Where are we going and how do we get there?},
    booktitle = {Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications},
    month     = {September},
    year      = {2017},
    address   = {Copenhagen, Denmark},
    publisher = {Association for Computational Linguistics},
    pages     = {180--187},
    url       = {http://www.aclweb.org/anthology/W17-5019}
}
Online news platforms curate high-quality content for their readers and, in many cases, users can post comments in response. While comment threads routinely contain unproductive banter, insults, or users "shouting" over each other, there are often good discussions buried among the noise. In this paper, we define a new task of identifying "good" conversations, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. Our model successfully identifies ERICs posted in response to online news articles (F1 = 0.73) and in debate forums (F1 = 0.91).
@inproceedings{napoles2017automatically,
    title     = {Automatically Identifying Good Conversations Online (Yes, They Do Exist!)},
    author    = {Napoles, Courtney and Pappu, Aasish and Tetreault, Joel},
    booktitle = {Eleventh International AAAI Conference on Web and Social Media},
    year      = {2017}
}
This work presents a dataset and annotation scheme for the new task of identifying "good" conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k threads and 10k comments) and 1k threads from the Internet Argument Corpus; and analyze the features characteristic of ERICs. This is one of the largest annotated corpora of online human dialogues, with the most detailed set of annotations. It will be valuable for identifying ERICs and other aspects of argumentation, dialogue, and discourse.
@InProceedings{napoles-EtAl:2017:LAW,
    author    = {Napoles, Courtney  and  Tetreault, Joel  and  Pappu, Aasish  and  Rosato, Enrica  and  Provenzale, Brian},
    title     = {Finding Good Conversations Online: {The Yahoo News Annotated Comments Corpus}},
    booktitle = {Proceedings of the 11th Linguistic Annotation Workshop},
    month     = {April},
    year      = {2017},
    address   = {Valencia, Spain},
    publisher = {Association for Computational Linguistics},
    pages     = {13--23},
    url       = {http://www.aclweb.org/anthology/W17-0802}
}
We present a new parallel corpus, JHU FLuency-Extended GUG corpus (JFLEG) for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits to not only correct grammatical errors but also make the original text more native sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and how they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC.
@InProceedings{napoles-sakaguchi-tetreault:2017:EACLshort,
    author    = {Napoles, Courtney  and  Sakaguchi, Keisuke  and  Tetreault, Joel},
    title     = {{JFLEG}: A Fluency Corpus and Benchmark for Grammatical Error Correction},
    booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers},
    month     = {April},
    year      = {2017},
    address   = {Valencia, Spain},
    publisher = {Association for Computational Linguistics},
    pages     = {229--234},
    url       = {http://www.aclweb.org/anthology/E17-2037}
}
Current methods for automatically evaluating grammatical error correction (GEC) systems rely on gold-standard references. However, these methods suffer from penalizing grammatical edits that are correct but not in the gold standard. We show that reference-less grammaticality metrics correlate very strongly with human judgments and are competitive with the leading reference-based evaluation metrics. By interpolating both methods, we achieve state-of-the-art correlation with human judgments. Finally, we show that GEC metrics are much more reliable when they are calculated at the sentence level instead of the corpus level. We have set up a CodaLab site for benchmarking GEC output using a common dataset and different evaluation metrics.
@InProceedings{napoles-sakaguchi-tetreault:2016:EMNLP2016,
    author    = {Napoles, Courtney  and  Sakaguchi, Keisuke  and  Tetreault, Joel},
    title     = {There's No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction},
    booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
    month     = {November},
    year      = {2016},
    address   = {Austin, Texas},
    publisher = {Association for Computational Linguistics},
    pages     = {2109--2115},
    url       = {https://aclweb.org/anthology/D16-1228}
}
The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics. One unvisited assumption, however, is the reliance of GEC evaluation on error-coded corpora, which contain specific labeled corrections. We examine current practices and show that GEC's reliance on such corpora unnaturally constrains annotation and automatic evaluation, resulting in (a) sentences that do not sound acceptable to native speakers and (b) system rankings that do not correlate with human judgments. In light of this, we propose an alternate approach that jettisons costly error coding in favor of unannotated, whole-sentence rewrites. We compare the performance of existing metrics over different gold-standard annotations, and show that automatic evaluation with our new annotation scheme has very strong correlation with expert rankings (rho = 0.82). As a result, we advocate for a fundamental and necessary shift in the goal of GEC, from correcting small, labeled error types, to producing text that has native fluency.
@article{tacl-gec-eval-2016,
    author   = {Sakaguchi, Keisuke  and Napoles, Courtney  and Post, Matt and Tetreault, Joel },
    title    = {Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality},
    journal  = {Transactions of the Association for Computational Linguistics},
    volume   = {4},
    year     = {2016},
    issn     = {2307-387X},
    url      = {https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/800},
    pages    = {169--182}
}
The Automated Evaluation of Scientific Writing, or AESW, is the task of identifying sentences in need of correction to ensure their appropriateness in scientific prose. The data set comes from a professional editing company, VTeX, with two aligned versions of the same text (before and after editing) and covers a variety of textual infelicities that proofreaders have edited. While previous shared tasks focused solely on grammatical errors (Dale and Kilgarriff, 2011; Dale et al., 2012; Ng et al., 2013; Ng et al., 2014), this time the edits cover other types of linguistic misfits as well, including those that almost certainly could be interpreted as style issues and similar “matters of opinion”. The latter arise because of different language editing traditions, experience, and the absence of uniform agreement on what “good” scientific language should look like. In initiating this task, we expected the participating teams to help identify the characteristics of “good” scientific language and to help create a consensus on which language improvements are acceptable (or necessary). Six participating teams took on the challenge.
@InProceedings{daudaravicius-EtAl:2016:BEA11,
    author    = {Daudaravicius, Vidas  and  Banchs, Rafael E.  and  Volodina, Elena  and  Napoles, Courtney},
    title     = {A Report on the Automatic Evaluation of Scientific Writing Shared Task},
    booktitle = {Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications},
    month     = {June},
    year      = {2016},
    address   = {San Diego, CA},
    publisher = {Association for Computational Linguistics},
    pages     = {53--62},
    url       = {http://www.aclweb.org/anthology/W16-0506}
}
In this work, we estimate the deterioration of NLP processing given an estimate of the amount and nature of grammatical errors in a text. From a corpus of essays written by English-language learners, we extract ungrammatical sentences, controlling the number and types of errors in each sentence. We focus on six categories of errors that are commonly made by English-language learners, and consider sentences containing one or more of these errors. To evaluate the effect of grammatical errors, we measure the deterioration of ungrammatical dependency parses using the labeled F-score, an adaptation of the labeled attachment score. We find notable differences between the influence of individual error types on the dependency parse, as well as interactions between multiple errors.
@InProceedings{napoles-cahill-madnani:2016:BEA11,
    author    = {Napoles, Courtney  and  Cahill, Aoife  and  Madnani, Nitin},
    title     = {The Effect of Multiple Grammatical Errors on Processing Non-Native Writing},
    booktitle = {Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications},
    month     = {June},
    year      = {2016},
    address   = {San Diego, CA},
    publisher = {Association for Computational Linguistics},
    pages     = {1--11},
    url       = {http://www.aclweb.org/anthology/W16-0501}
}
We present a simple, prepackaged solution to generating paraphrases of English sentences. We use the Paraphrase Database (PPDB) for monolingual sentence rewriting and provide machine translation language packs: prepackaged, tuned models that can be downloaded and used to generate paraphrases on a standard Unix environment. The language packs can be treated as a black box or customized to specific tasks. In this demonstration, we will explain how to use the included interactive web-based tool to generate sentential paraphrases.
@InProceedings{napoles-callisonburch-post:2016:N16-3,
    author    = {Napoles, Courtney  and  Callison-Burch, Chris  and  Post, Matt},
    title     = {Sentential Paraphrasing as Black-Box Machine Translation},
    booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations},
    month     = {June},
    year      = {2016},
    address   = {San Diego, California},
    publisher = {Association for Computational Linguistics},
    pages     = {62--66},
    url       = {http://www.aclweb.org/anthology/N16-3013}
}
Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods are limited by the quality and quantity of manually simplified corpora, which are expensive to build. In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text simplification, taking advantage of large-scale paraphrases learned from bilingual texts and a small amount of manual simplifications with multiple references. Our work is the first to design automatic metrics that are effective for tuning and evaluating simplification systems, which will facilitate iterative development for this task.
@article{xu2016optimizing,
    author  = {Xu, Wei  and Napoles, Courtney  and Pavlick, Ellie  and Chen, Quanze  and Callison-Burch, Chris },
    title   = {Optimizing Statistical Machine Translation for Text Simplification},
    journal = {Transactions of the Association for Computational Linguistics},
    volume  = {4},
    year    = {2016},
    issn    = {2307-387X},
    url     = {https://transacl.org/ojs/index.php/tacl/article/view/741},
    pages   = {401--415}
}
How do we know which grammatical error correction (GEC) system is best? A number of metrics have been proposed over the years, each motivated by weaknesses of previous metrics; however, the metrics themselves have not been compared to an empirical gold standard grounded in human judgments. We conducted the first human evaluation of GEC system outputs, and show that the rankings produced by metrics such as MaxMatch and I-measure do not correlate well with this ground truth. As a step towards better metrics, we also propose GLEU, a simple variant of BLEU, modified to account for both the source and the reference, and show that it hews much more closely to human judgments.
@InProceedings{napoles-EtAl:2015:ACL-IJCNLP,
    author    = {Napoles, Courtney  and  Sakaguchi, Keisuke  and  Post, Matt  and  Tetreault, Joel},
    title     = {Ground Truth for Grammatical Error Correction Metrics},
    booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
    month     = {July},
    year      = {2015},
    address   = {Beijing, China},
    publisher = {Association for Computational Linguistics},
    pages     = {588--593},
    url       = {http://www.aclweb.org/anthology/P15-2097}
}
In this work, we explore applications of automatic essay scoring (AES) to a corpus of essays written by college freshmen and discuss the challenges we faced. While most AES systems evaluate highly constrained writing, we developed a system that handles open-ended, long-form writing. We present a novel corpus for this task, containing more than 3,000 essays and drafts written for a freshman writing course. We describe statistical analysis of the corpus and identify problems with automatically scoring this type of data. We then demonstrate how to overcome grader bias by using a multi-task setup, predicting scores as well as human graders do on a different dataset. Finally, we discuss how AES can help teachers assign more uniform grades.
@InProceedings{napoles-callisonburch:2015:bea,
    author    = {Napoles, Courtney  and  Callison-Burch, Chris},
    title     = {Automatically Scoring Freshman Writing: A Preliminary Investigation},
    booktitle = {Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications},
    month     = {June},
    year      = {2015},
    address   = {Denver, Colorado},
    publisher = {Association for Computational Linguistics},
    pages     = {254--263},
    url       = {http://www.aclweb.org/anthology/W15-0629}
}
Simple Wikipedia has dominated simplification research in the past 5 years. In this opinion paper, we argue that focusing on Wikipedia limits simplification research. We back up our arguments with corpus analysis and by highlighting statements that other researchers have made in the simplification literature. We introduce a new simplification dataset that is a significant improvement over Simple Wikipedia, and present a novel quantitative-comparative approach to study the quality of simplification data resources.
@article{xu2015problems,
    author  = {Xu, Wei  and Callison-Burch, Chris  and Napoles, Courtney },
    title   = {Problems in Current Text Simplification Research: New Data Can Help},
    journal = {Transactions of the Association for Computational Linguistics},
    volume  = {3},
    year    = {2015},
    issn    = {2307-387X},
    url     = {https://transacl.org/ojs/index.php/tacl/article/view/549},
    pages   = {283--297}
}
We have created layers of annotation on the English Gigaword v.5 corpus to render it useful as a standardized corpus for knowledge extraction and distributional semantics. Most existing large-scale work is based on inconsistent corpora which often have needed to be re-annotated by research teams independently, each time introducing biases that manifest as results that are only comparable at a high level. We provide to the community a public reference set based on current state-of-the-art syntactic analysis and coreference resolution, along with an interface for programmatic access. Our goal is to enable broader involvement in large-scale knowledge-acquisition efforts by researchers that otherwise may not have had the ability to produce such a resource on their own.
@InProceedings{napoles-gormley-vandurme:2012:AKBC-WEKEX,
    author    = {Napoles, Courtney  and  Gormley, Matthew  and  Van Durme, Benjamin},
    title     = {Annotated {G}igaword},
    booktitle = {Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX)},
    month     = {June},
    year      = {2012},
    address   = {Montr{\'e}al, Canada},
    publisher = {Association for Computational Linguistics},
    pages     = {95--100},
    url       = {http://www.aclweb.org/anthology/W12-3018}
}
Previous work has shown that high quality phrasal paraphrases can be extracted from bilingual parallel corpora. However, it is not clear whether bitexts are an appropriate resource for extracting more sophisticated sentential paraphrases, which are more obviously learnable from monolingual parallel corpora. We extend bilingual paraphrase extraction to syntactic paraphrases and demonstrate its ability to learn a variety of general paraphrastic transformations, including passivization, dative shift, and topicalization. We discuss how our model can be adapted to many text generation tasks by augmenting its feature set, development data, and parameter estimation routine. We illustrate this adaptation by using our paraphrase model for the task of sentence compression and achieve results competitive with state-of-the-art compression systems.
@InProceedings{ganitkevitch-EtAl:2011:EMNLP,
    author    = {Ganitkevitch, Juri  and  Callison-Burch, Chris  and  Napoles, Courtney  and  Van Durme, Benjamin},
    title     = {Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation},
    booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
    month     = {July},
    year      = {2011},
    address   = {Edinburgh, Scotland, UK.},
    publisher = {Association for Computational Linguistics},
    pages     = {1168--1179},
    url       = {http://www.aclweb.org/anthology/D11-1108}
}
This work surveys existing evaluation methodologies for the task of sentence compression, identifies their shortcomings, and proposes alternatives. In particular, we examine the problems of evaluating paraphrastic compression and comparing the output of different models. We demonstrate that compression rate is a strong predictor of compression quality and that perceived improvement over other models is often a side effect of producing longer output.
@InProceedings{napoles-vandurme-callisonburch:2011:T2TW-2011,
    author    = {Napoles, Courtney  and  Van Durme, Benjamin  and  Callison-Burch, Chris},
    title     = {Evaluating Sentence Compression: Pitfalls and Suggested Remedies},
    booktitle = {Proceedings of the Workshop on Monolingual Text-To-Text Generation},
    month     = {June},
    year      = {2011},
    address   = {Portland, Oregon},
    publisher = {Association for Computational Linguistics},
    pages     = {91--97},
    url       = {http://www.aclweb.org/anthology/W11-1611}
}
We present a substitution-only approach to sentence compression which “tightens” a sentence by reducing its character length. Replacing phrases with shorter paraphrases yields paraphrastic compressions as short as 60% of the original length. In support of this task, we introduce a novel technique for re-ranking paraphrases extracted from bilingual corpora. At high compression rates, paraphrastic compressions outperform a state-of-the-art deletion model in an oracle experiment. For further compression, deleting from oracle paraphrastic compressions preserves more meaning than deletion alone. In either setting, paraphrastic compression shows promise for surpassing deletion-only methods.
@InProceedings{napoles-EtAl:2011:T2TW-2011,
    author    = {Napoles, Courtney  and  Callison-Burch, Chris  and  Ganitkevitch, Juri  and  Van Durme, Benjamin},
    title     = {Paraphrastic Sentence Compression with a Character-based Metric: Tightening without Deletion},
    booktitle = {Proceedings of the Workshop on Monolingual Text-To-Text Generation},
    month     = {June},
    year      = {2011},
    address   = {Portland, Oregon},
    publisher = {Association for Computational Linguistics},
    pages     = {84--90},
    url       = {http://www.aclweb.org/anthology/W11-1610}
}
Text simplification is the process of changing vocabulary and grammatical structure to create a more accessible version of the text while maintaining the underlying information and content. Automated tools for text simplification are a practical way to make large corpora of text accessible to a wider audience lacking high levels of fluency in the corpus language. In this work, we investigate the potential of Simple Wikipedia to assist automatic text simplification by building a statistical classification system that discriminates simple English from ordinary English. Most text simplification systems are based on hand-written rules (e.g., PEST (Carroll et al., 1999) and its module SYSTAR (Canning et al., 2000)) and therefore face limitations in scaling and transferring across domains. The potential for using Simple Wikipedia for text simplification is significant; it contains nearly 60,000 articles with revision histories and articles aligned to ordinary English Wikipedia. Using articles from Simple Wikipedia and ordinary Wikipedia, we evaluated different classifiers and feature sets to identify the most discriminative features of simple English for use across domains. These findings further our understanding of what makes text simple and can be applied as a tool to help writers craft simple text.
@InProceedings{napoles-dredze:2010:CLW,
    author    = {Napoles, Courtney  and  Dredze, Mark},
    title     = {Learning {Simple Wikipedia}: A Cogitation in Ascertaining Abecedarian Language},
    booktitle = {Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids},
    month     = {June},
    year      = {2010},
    address   = {Los Angeles, CA, USA},
    publisher = {Association for Computational Linguistics},
    pages     = {42--50},
    url       = {http://www.aclweb.org/anthology/W10-0406}
}