Reading Text Files in Python
The Python file object provides several ways to read a text file. A popular one is the readlines() method, which returns a list of all the lines in the file. However, it is not suitable for reading a large text file, because the whole file content is loaded into memory.
We can use the file object as an iterator. The iterator returns the lines one at a time, so each line can be processed as it arrives. This does not read the whole file into memory, which makes it suitable for reading large files in Python.
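To make the difference concrete, the following is a minimal sketch that counts the lines of a file both ways and compares the peak memory Python allocated along the way. It uses the standard tracemalloc module rather than anything from the original article. The helper names (peak_traced_bytes, count_with_readlines, count_with_iteration) and the small temporary sample file are assumptions made for illustration.

```python
import os
import tempfile
import tracemalloc


def peak_traced_bytes(func, path):
    """Run func(path); return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def count_with_readlines(path):
    # readlines() materialises every line in one list before we count them.
    with open(path) as f:
        return len(f.readlines())


def count_with_iteration(path):
    # Iterating keeps only the current line (plus an internal buffer) alive.
    count = 0
    with open(path) as f:
        for _ in f:
            count += 1
    return count


# Throwaway sample file (hypothetical data, far smaller than the article's file).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(50_000):
        tmp.write(f"line number {i}\n")
    sample_path = tmp.name

eager_count, eager_peak = peak_traced_bytes(count_with_readlines, sample_path)
lazy_count, lazy_peak = peak_traced_bytes(count_with_iteration, sample_path)
os.remove(sample_path)

print(f"readlines(): {eager_count} lines, peak {eager_peak} bytes")
print(f"iteration:   {lazy_count} lines, peak {lazy_peak} bytes")
```

Both functions should report the same line count, while the readlines() version is expected to show a noticeably higher peak. The exact numbers depend on the Python version and platform.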
Here is a code snippet that reads a large file in Python by treating the file object as an iterator.
import resource
import os

file_name = "/Users/pankaj/abcdef.txt"

print(f'File Size is {os.stat(file_name).st_size / (1024 * 1024)} MB')

txt_file = open(file_name)

count = 0

for line in txt_file:
    # we can process file line by line here, for simplicity I am taking count of lines
    count += 1

txt_file.close()

print(f'Number of Lines in the file is {count}')

print('Peak Memory Usage =', resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
print('User Mode Time =', resource.getrusage(resource.RUSAGE_SELF).ru_utime)
print('System Mode Time =', resource.getrusage(resource.RUSAGE_SELF).ru_stime)
When we run this program, the output produced is:
File Size is 257.4920654296875 MB
Number of Lines in the file is 60000000
Peak Memory Usage = 5840896
User Mode Time = 11.46692
System Mode Time = 0.09655899999999999
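A note on reading the "Peak Memory Usage" figure: the unit of ru_maxrss depends on the operating system. Linux reports kilobytes, while macOS reports bytes. The /Users/... path suggests the program was run on macOS, in which case 5840896 would be roughly 5.6 MB, but the article does not state the platform, so treat that as an assumption. The sketch below normalises the value to megabytes. The helper name peak_rss_mb is hypothetical, and the resource module is only available on Unix-like systems.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MB (Unix only)."""
    max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Assumption: macOS reports bytes; Linux and most other Unix systems
    # report kilobytes.
    if sys.platform == "darwin":
        return max_rss / (1024 * 1024)
    return max_rss / 1024


print(f"Peak Memory Usage = {peak_rss_mb():.2f} MB")
```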
We can also use the with statement to open the file. In this case, we don’t have to close the file object explicitly.
with open(file_name) as txt_file:
    for line in txt_file:
        # process the line
        pass
The above code works well when the content of the large file is split across many lines. But if a single line holds a large amount of data, reading that line will use a lot of memory. In that case, we can read the file content into a fixed-size buffer and process it piece by piece.
with open(file_name) as f:
    while True:
        data = f.read(1024)
        if not data:
            break
        print(data)
The above code reads the file data in chunks of up to 1024 characters (the file is opened in text mode, so read(1024) counts characters rather than bytes) and prints each chunk to the console.
Once the whole file has been read, read() returns an empty string, so data is empty and the break statement terminates the while loop.
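The same read-until-empty loop can be wrapped in a generator so that the chunking logic is reusable. This is a minimal sketch, not code from the original article: the function name read_in_chunks is hypothetical, and io.StringIO stands in for an opened text file so the example runs without a file on disk.

```python
import io


def read_in_chunks(file_obj, chunk_size=1024):
    """Yield successive chunks until read() returns an empty value."""
    while True:
        data = file_obj.read(chunk_size)
        if not data:
            # An empty string (or empty bytes) signals end of file.
            break
        yield data


# io.StringIO behaves like a text file; 2500 characters on a single "line".
sample = io.StringIO("x" * 2500)
chunk_lengths = [len(chunk) for chunk in read_in_chunks(sample)]
print(chunk_lengths)  # [1024, 1024, 452]
```

The last chunk is shorter because only 452 characters remain after two full reads.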
This method is also useful for reading binary files such as images, PDFs, Word documents, etc. For those, the file should be opened in binary mode ('rb'), in which case read(1024) returns up to 1024 bytes.
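As an illustration of chunked reading on binary data, here is a minimal sketch that computes a SHA-256 digest of a file without loading the whole file at once. The hashing use case, the function name sha256_of_file, and the generated sample payload are assumptions chosen for the example and do not come from the original article.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024):
    """Hash a file by reading it chunk_size bytes at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:  # "rb": raw bytes, no text decoding
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
    return digest.hexdigest()


payload = bytes(range(256)) * 20  # hypothetical binary content
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(payload)
    sample_path = tmp.name

chunked_digest = sha256_of_file(sample_path)
os.remove(sample_path)

# The chunked digest should match hashing the payload in one call.
print(chunked_digest == hashlib.sha256(payload).hexdigest())
```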
Here is a simple code snippet that makes a copy of a file.
with open(destination_file_name, 'w') as out_file:
    with open(source_file_name) as in_file:
        for line in in_file:
            out_file.write(line)
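The line-by-line copy above assumes a text file with reasonably short lines. For binary files, or files with very long lines, one alternative is to copy in fixed-size binary chunks. This is a sketch using the standard library's shutil.copyfileobj, not the approach shown in the original article. The function name copy_file, the 1 MB chunk size, and the temporary file names are assumptions for illustration.

```python
import os
import shutil
import tempfile


def copy_file(source_file_name, destination_file_name, chunk_size=1024 * 1024):
    """Copy a file in fixed-size binary chunks so memory use stays bounded."""
    with open(source_file_name, "rb") as in_file:
        with open(destination_file_name, "wb") as out_file:
            # copyfileobj reads and writes chunk_size bytes at a time.
            shutil.copyfileobj(in_file, out_file, chunk_size)


work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "source.bin")
dst = os.path.join(work_dir, "copy.bin")
with open(src, "wb") as f:
    f.write(b"no newline here " * 1000)  # one long "line" of sample data

copy_file(src, dst)
with open(src, "rb") as a, open(dst, "rb") as b:
    copies_match = a.read() == b.read()
shutil.rmtree(work_dir)

print(copies_match)
```

Opening both files in binary mode also avoids any newline translation or encoding differences between the source and the copy.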
Reference: StackOverflow Question
Translated from: https://www.journaldev.com/32059/read-large-text-files-in-python