nlp - Delete stop words in a file in Python


I have a file that consists of stop words (each on a new line), and another file (a corpus, actually) that consists of a lot of sentences, each on a new line. I have to delete the stop words in the corpus and return each line without stop words. I wrote some code, but it returns only one sentence. (The language is Persian.) How can I fix it so that it returns all of the sentences?

    with open("stopwords.txt", encoding="utf-8") as f1:
        with open("train.txt", encoding="utf-8") as f2:
            for i in f1:
                for line in f2:
                    if i in line:
                        line = line.replace(i, "")
    with open("nostopwordstrain.txt", "w", encoding="utf-8") as f3:
        f3.write(line)

The problem is that the last two lines of code are not in the loop. You are iterating through the entire f2, line by line, and doing nothing with the result. Then, after the last line, you write only that last line to f3. Instead, try:

    with open("stopwords.txt", encoding="utf-8") as stopfile:
        # Strip the trailing newline from each stop word and skip blank lines
        stopwords = [w.strip() for w in stopfile if w.strip()]
        print(stopwords)  # check that you got all the words

    with open("train.txt", encoding="utf-8") as trainfile:
        with open("nostopwordstrain.txt", "w", encoding="utf-8") as newfile:
            for line in trainfile:  # go through each line
                for word in stopwords:  # go through and replace each word
                    line = line.replace(word, "")
                newfile.write(line)  # write every line, inside the loop
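One caveat with `str.replace` is that it matches substrings, so a short stop word is also deleted from inside longer words. Here is a minimal sketch of a whole-word alternative, assuming tokens in the corpus are separated by whitespace and each stop word is a single token. The names `remove_stopwords` and `filter_corpus` are made up for illustration, and real Persian text may need a proper tokenizer to deal with punctuation and the zero-width non-joiner.

```python
def remove_stopwords(line, stopwords):
    """Return `line` without the whitespace-separated tokens found in `stopwords`."""
    kept = [token for token in line.split() if token not in stopwords]
    return " ".join(kept)


def filter_corpus(stop_path, corpus_path, out_path):
    """Write every line of `corpus_path` to `out_path` with stop words removed."""
    with open(stop_path, encoding="utf-8") as stopfile:
        # A set gives fast membership tests; strip() drops the trailing newline
        stopwords = {w.strip() for w in stopfile if w.strip()}

    with open(corpus_path, encoding="utf-8") as corpus, \
            open(out_path, "w", encoding="utf-8") as out:
        for line in corpus:
            # One output line per input line, written inside the loop
            out.write(remove_stopwords(line, stopwords) + "\n")
```

With the file names from the question, this would be called as `filter_corpus("stopwords.txt", "train.txt", "nostopwordstrain.txt")`. Note that joining on a single space normalizes any runs of whitespace in the original line.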
