Package 'antiword'

Title: Extract Text from Microsoft Word Documents
Description: Wraps the 'AntiWord' utility to extract text from Microsoft Word documents. The utility supports only the old 'doc' format, not the newer XML-based 'docx' format. Use the 'xml2' package to read the latter.
Authors: Jeroen Ooms [aut, cre], Adri van Os [cph] (Author 'antiword' utility)
Maintainer: Jeroen Ooms <[email protected]>
License: GPL-2
Version: 1.3.4
Built: 2024-11-02 05:42:49 UTC
Source: https://github.com/ropensci/antiword

Help Index


Antiword

Description

Wraps the antiword utility. Takes a path to a Word file and returns the text from the document.

Usage

antiword(file = NULL, format = FALSE)

Arguments

file

path or URL to your Word file

format

whether to format the output text (the -f parameter of antiword); defaults to FALSE

Examples

text <- antiword("https://jeroen.github.io/files/UDHR-english.doc")
cat(text)
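
The example above reads from a URL with the default format = FALSE. As a minimal sketch of the other documented options, the snippet below reads a local file with format = TRUE and then splits the result into lines. The path "report.doc" is a hypothetical placeholder, and the line splitting assumes the function returns the document text as a character string; neither is stated in this manual.

```r
library(antiword)

# Hypothetical local path (an assumption); replace with a real .doc file.
path <- "report.doc"

if (file.exists(path)) {
  # format = TRUE requests formatted output (the -f parameter described
  # under Arguments).
  formatted <- antiword(path, format = TRUE)

  # Assumption: the result is a character string, so splitting on
  # newlines gives the text line by line.
  lines <- strsplit(formatted, "\n", fixed = TRUE)[[1]]
  print(head(lines))
} else {
  message("No file found at ", path, "; nothing to extract.")
}
```

The file.exists() guard keeps the sketch from failing when the placeholder path is absent. Note that only 'doc' files are supported, so a 'docx' file would not be read here.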