How to Read Large Text Files in Python

2023-11-19


Python's file object provides several ways to read a text file. A popular one is the readlines() method, which returns a list of all the lines in the file. However, it is not suitable for a large text file, because the whole file content is loaded into memory.
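To make the contrast concrete, here is a minimal sketch. It assumes a small temporary file created only for the demonstration (the name sample_path is hypothetical). It shows that readlines() builds a complete list, while the file object itself can hand out one line at a time.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

# readlines() materialises every line in a single list,
# so memory use grows with the size of the file.
with open(sample_path, encoding="utf-8") as f:
    all_lines = f.readlines()
print(type(all_lines).__name__, len(all_lines))

# The file object is its own iterator: next() fetches just one line.
with open(sample_path, encoding="utf-8") as f:
    first_line = next(f)
print(repr(first_line))

os.remove(sample_path)
```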

Reading Large Text Files in Python

We can use the file object as an iterator. The iterator returns the lines one at a time, so each line can be processed as it is read. This does not load the whole file into memory, which makes it suitable for reading large files in Python.

Here is a code snippet that reads a large file in Python by treating the file object as an iterator.

import resource
import os

file_name = "/Users/pankaj/abcdef.txt"

print(f'File Size is {os.stat(file_name).st_size / (1024 * 1024)} MB')

txt_file = open(file_name)

count = 0

for line in txt_file:
    # we can process file line by line here, for simplicity I am taking count of lines
    count += 1

txt_file.close()

print(f'Number of Lines in the file is {count}')

print('Peak Memory Usage =', resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
print('User Mode Time =', resource.getrusage(resource.RUSAGE_SELF).ru_utime)
print('System Mode Time =', resource.getrusage(resource.RUSAGE_SELF).ru_stime)

When we run this program, it produces the following output:

File Size is 257.4920654296875 MB
Number of Lines in the file is 60000000
Peak Memory Usage = 5840896
User Mode Time = 11.46692
System Mode Time = 0.09655899999999999


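I am using the os module to print the size of the file.
The resource module is used to check the memory and CPU time usage of the program. It is available only on Unix-like systems, and ru_maxrss is reported in kilobytes on Linux but in bytes on macOS. If the run above was on macOS, as the /Users/ path suggests, the peak memory usage is about 5.6 MB for a 257 MB file.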
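Because the resource module is platform-specific, the comparison can also be sketched with the standard tracemalloc module, which tracks allocations made by Python itself. This is an illustrative measurement, not the original author's method. The helper names (peak_bytes, count_with_readlines, count_with_iteration) and the generated sample file are assumptions made for this example.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(func, path):
    """Run func(path) and return (result, peak traced memory in bytes)."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def count_with_readlines(path):
    # Loads every line into one list before counting.
    with open(path, encoding="utf-8") as f:
        return len(f.readlines())


def count_with_iteration(path):
    # Holds only the current line (plus the I/O buffer) at any time.
    count = 0
    with open(path, encoding="utf-8") as f:
        for _ in f:
            count += 1
    return count


# Hypothetical sample file, generated only for this measurement.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    for i in range(200_000):
        tmp.write(f"line number {i}\n")
    sample_path = tmp.name

lines_a, peak_readlines = peak_bytes(count_with_readlines, sample_path)
lines_b, peak_iteration = peak_bytes(count_with_iteration, sample_path)
os.remove(sample_path)

print(f"readlines(): {lines_a} lines, peak {peak_readlines} bytes")
print(f"iteration:   {lines_b} lines, peak {peak_iteration} bytes")
```

The absolute numbers depend on the Python version and platform. The point of the sketch is only the relative difference between the two peaks.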

We can also use the with statement to open the file. In this case, we don't have to close the file object explicitly.

with open(file_name) as txt_file:
    for line in txt_file:
        # process the line
        pass
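One practical benefit of the with statement is that the file is closed even when processing fails partway through. The following sketch illustrates this with a simulated error. The temporary file and the ValueError are assumptions made purely for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("alpha\nbeta\n")
    sample_path = tmp.name

try:
    with open(sample_path, encoding="utf-8") as txt_file:
        for line in txt_file:
            # Simulate a failure while processing the first line.
            raise ValueError("simulated processing error")
except ValueError:
    pass

# The with block has already closed the file, despite the exception.
closed_after_error = txt_file.closed
print("File closed after error:", closed_after_error)

os.remove(sample_path)
```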

What if the Large File Doesn't Have Lines?

The code above works well when the content of the large file is divided into many lines. But if a large amount of data sits on a single line, reading that line will use a lot of memory. In that case, we can read the file content into a fixed-size buffer and process it piece by piece.

with open(file_name) as f:
    while True:
        data = f.read(1024)
        if not data:
            break
        print(data)

The code above reads the file in chunks of at most 1024 characters and prints each chunk to the console. Because the file is opened in text mode, the size argument of read() counts characters, not bytes; in binary mode it would count bytes.

Once the whole file has been read, read() returns an empty string, so data is empty and the break statement terminates the while loop.
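The read loop can be wrapped in a generator so that the caller simply iterates over chunks. This is a minimal sketch. The function name read_in_chunks is hypothetical, and an in-memory io.StringIO stands in for a large single-line file.

```python
import io


def read_in_chunks(file_obj, chunk_size=1024):
    """Yield successive chunks until read() returns an empty value."""
    while True:
        chunk = file_obj.read(chunk_size)
        if not chunk:
            # Empty str (text mode) or empty bytes (binary mode): end of file.
            return
        yield chunk


# Stand-in for a file whose content is one very long line.
one_long_line = "x" * 2500
chunk_sizes = [len(chunk) for chunk in read_in_chunks(io.StringIO(one_long_line))]
print(chunk_sizes)
```

Only one chunk is held in memory at a time, regardless of how long the line is.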

This method is also useful for reading binary files such as images, PDFs, and Word documents. For those, open the file in binary mode ('rb') so that read() returns bytes instead of text.
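As one example of chunked binary reading, the sketch below computes a SHA-256 checksum of a file without loading it whole. The function name sha256_of_file, the 64 KB chunk size, and the generated payload are assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=64 * 1024):
    """Hash a file by feeding it to hashlib one binary chunk at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # bytes, at most chunk_size long
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical binary payload, written to a temporary file for the demo.
payload = bytes(range(256)) * 1000
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(payload)
    sample_path = tmp.name

file_digest = sha256_of_file(sample_path)
os.remove(sample_path)

# The chunked digest matches hashing the payload in one go.
print(file_digest == hashlib.sha256(payload).hexdigest())
```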

Here is a simple code snippet that makes a copy of a text file.

with open(destination_file_name, 'w') as out_file:
    with open(source_file_name) as in_file:
        for line in in_file:
            out_file.write(line)
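The snippet above works in text mode and goes line by line, so it is best suited to text files. For binary files, a variant that copies fixed-size chunks in binary mode might look like the sketch below. The function name copy_binary and the temporary paths are hypothetical, and shutil.copyfileobj from the standard library does the chunked copying.

```python
import os
import shutil
import tempfile


def copy_binary(source_file_name, destination_file_name, chunk_size=64 * 1024):
    """Copy a file in binary mode, chunk_size bytes at a time."""
    with open(source_file_name, "rb") as in_file, \
            open(destination_file_name, "wb") as out_file:
        shutil.copyfileobj(in_file, out_file, chunk_size)


# Hypothetical source file containing bytes that are not valid text.
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "source.bin")
copy_path = os.path.join(work_dir, "copy.bin")
payload = b"\x00\xff binary\r\n data \x80" * 10_000
with open(source_path, "wb") as f:
    f.write(payload)

copy_binary(source_path, copy_path)

with open(copy_path, "rb") as f:
    copied = f.read()
shutil.rmtree(work_dir)

print(copied == payload)
```

Binary mode avoids any decoding or newline translation, so the copy is byte-for-byte identical to the source.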

Reference: StackOverflow Question

Translated from: https://www.journaldev.com/32059/read-large-text-files-in-python

Original link: https://hbdhgg.com/1/183231.html
