Python Forum
How to summarize an article that is stored in a word document on your laptop?
#1
So I am new here. I wrote some code in PyCharm that summarizes online articles; it is shown below and works fine. What if I want to summarize an article that is stored in a Word document on my laptop? Can somebody help me with the code? I am using Anaconda Prompt and PyCharm.

import tkinter as tk
import nltk
from textblob import TextBlob
from newspaper import Article

url = "https://www.news.com/index.html"
article = Article(url)

article.download()
article.parse()
article.nlp()

print(f'Title: {article.title}')
print(f'Authors: {article.authors}')
print(f'Publication Date: {article.publish_date}')
print(f'Summary: {article.summary}')
Gribouillis wrote Oct-06-2023, 03:42 AM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
#2
Word documents carry metadata. You can access that, if that is what you are looking for.

I think a Word document will only have a title, author name, etc. if the author actually puts that data in the document metadata.

For personal stuff, I don't think many people will do that.

Maybe the publish date and modified date are recorded automatically.

I copied this from Stack Overflow:

# if you don't have it, first install python-docx module:
# pip3 install python-docx
import docx

path2file = "/home/pedro/myStuff/mydocument1.docx"

def getMetaData(doc):
    # collect the core document properties into a plain dict
    metadata = {}
    prop = doc.core_properties
    metadata["author"] = prop.author
    metadata["category"] = prop.category
    metadata["comments"] = prop.comments
    metadata["content_status"] = prop.content_status
    metadata["created"] = prop.created
    metadata["identifier"] = prop.identifier
    metadata["keywords"] = prop.keywords
    metadata["last_modified_by"] = prop.last_modified_by
    metadata["language"] = prop.language
    metadata["modified"] = prop.modified
    metadata["subject"] = prop.subject
    metadata["title"] = prop.title
    metadata["version"] = prop.version
    return metadata

doc = docx.Document(path2file)
metadata_dict = getMetaData(doc)
for item in metadata_dict.items():
    print(item)
Sometimes I want to get the text from .docx files. I never needed the metadata!
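For getting the text itself rather than the metadata, here is a minimal sketch using the same python-docx module. The helper names `paragraphs_to_text` and `read_docx_text` are made up for this example. It only collects body paragraphs via `doc.paragraphs`, so text that sits inside tables would be missed.

```python
def paragraphs_to_text(paragraphs):
    """Join the non-empty paragraph texts, separated by blank lines.

    Works with any objects that have a .text attribute, such as the
    paragraph objects returned by python-docx.
    """
    cleaned = (p.text.strip() for p in paragraphs)
    return "\n\n".join(t for t in cleaned if t)


def read_docx_text(path):
    """Return the body text of a .docx file as one string."""
    # Imported inside the function so paragraphs_to_text() can be used
    # even where python-docx is not installed.
    import docx

    doc = docx.Document(path)
    return paragraphs_to_text(doc.paragraphs)


# Example call (the path is the one used above, adjust it to your file):
# text = read_docx_text("/home/pedro/myStuff/mydocument1.docx")
# print(text[:500])
```

The returned string is plain text, so it can be passed to whatever summarizing code you prefer.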
#3
This code basically pulls just high-level information. I will try to write new code and will post it when done. Thanks so much, Pedro!
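No follow-up code was posted in the thread. As a rough sketch of the summarizing step, assuming the document's text has already been read into a string, the following uses a naive word-frequency scorer from the standard library. It is not the algorithm that newspaper's `article.nlp()` uses, and the names `STOPWORDS`, `split_sentences` and `summarize` are invented for this example.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; use a fuller one as needed).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def tokenize(text):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace (crude but dependency-free)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s.strip() for s in parts if s.strip()]


def summarize(text, max_sentences=3):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

Calling `summarize(text, max_sentences=3)` on the text of a Word document returns up to three of its sentences. The sentence splitting is simplistic (abbreviations like "e.g." will confuse it), so treat the result as a starting point.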