File:New framework for cross-domain document classification (IA newframeworkforc1094510786).pdf

Original file (1275 × 1650 pixels, file size: 4.67 MB, MIME type: application/pdf, 178 pages)

This file is from Wikimedia Commons and may be used by other projects. The information from its description page is shown below.

Summary

New framework for cross-domain document classification
Author	Gupta, Anjum
Title	New framework for cross-domain document classification
Publisher	Monterey, California. Naval Postgraduate School
Description	Automatic text document classification is a fundamental problem in machine learning. Given the dynamic nature and the exponential growth of the World Wide Web, one needs the ability to classify not only a massive number of documents, but also documents that belong to a wide variety of domains. Some examples of the domains are e-mails, blogs, Wikipedia articles, news articles, newsgroups, online chats, etc. It is the difference in writing style that differentiates these domains. Text documents are usually classified using supervised learning algorithms that require a large set of pre-labeled data. This requirement for labeled data poses a challenge in classifying documents that belong to different domains. Our goal is to classify text documents in the testing domain without requiring any labeled documents from the same domain. Our research develops specialized cross-domain learning algorithms based on the distributions over words obtained from a collection of text documents by topic models such as Latent Dirichlet Allocation (LDA). Our major contributions include (1) empirically showing that conventional supervised learning algorithms fail to generalize their learned models across different domains and (2) development of novel and specialized cross-domain classification algorithms that show an appreciable improvement over conventional methods used for cross-domain classification, an improvement that is consistent across different datasets. Our research addresses many real-world needs. Since a massive number of new types of text documents are generated daily, it is crucial to have the ability to transfer learned information from one domain to another domain. Cross-domain classification lets us leverage information learned from one domain for use in the classification of documents in a new domain. Subjects: Machine learning; Cross Domain Classification; Text Mining; Machine Learning; Genre Shift; Document Classification
Language	English
Publication date	March 2011
Current location	IA Collections: navalpostgraduateschoollibrary; fedlink
Accession number	newframeworkforc1094510786
Source	Internet Archive identifier: newframeworkforc1094510786 https://archive.org/download/newframeworkforc1094510786/newframeworkforc1094510786.pdf
Permission (Reusing this file)	This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. As such, it is in the public domain, and under the provisions of Title 17, United States Code, Section 105, may not be copyrighted.

Licensing

	This work is in the public domain in the United States because it is a work prepared by an officer or employee of the United States Federal Government as part of that person's official duties. Its legal status is governed by Title 17, Chapter 1, Section 105 of the United States Code. See Copyright. Note: this applies only to original works of the Federal Government, and not to works of any individual U.S. state, territory, commonwealth, county, municipality, or any other subdivision. This template also does not apply to postage stamp designs published by the United States Postal Service since 1978. (See § 313.6(C)(1) of the Compendium of U.S. Copyright Office Practices.) It also does not apply to certain U.S. coins; see the U.S. Mint Terms of Use.
This file has been identified as being free of known restrictions under copyright law, including all related and neighboring rights.

Creative Commons Public Domain Mark 1.0

File history

	Date/time	Thumbnail	Dimensions	User	Comment
current	10:23, 23 July 2020		1275 × 1650, 178 pages (4.67 MB)	Fæ	FEDLINK - United States Federal Collection newframeworkforc1094510786 (User talk:Fæ/IA books#Fork8) (batch 1993-2020 #22999)

File usage

No pages use this file.

Metadata

This file contains additional data, typically added by digital cameras or scanners. If the file was edited after it was created, some values may not match the current image.

Short title	New framework for cross-domain document classification
Author	Gupta, Anjum
Software used	Gupta, Anjum
Conversion program	MiKTeX pdfTeX-1.40.10
Encrypted	no
Page size	612 x 792 pts (letter)
PDF format version	1.4
