Python data structures use about 5x the memory of the raw data

Senior member: wohenfuyou

The data is structured roughly like this:
[{dic1….},{key1:[value1,value2…],key2:[value1,value2…]},{dic3},{dic4}]

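1.5 GB of data needs more than 7 GB of memory. Is that a normal amount of memory usage? And if I want to optimize it, where should I start? Any advice is appreciated~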

Replies (12)

Senior member: linksNoFound

Buy more memory.

Senior member: lkytal

Don't use Python (just kidding)

Senior member: est

The usual mitigation is __slots__. But if you're running into memory problems, I'd recommend numpy.
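To make the __slots__ suggestion concrete, here is a minimal sketch. The class names `PointDict` and `PointSlots` are made up for illustration, and the exact byte counts depend on the Python version, so none are quoted here.

```python
import sys


class PointDict:
    """Regular class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Same fields, but __slots__ removes the per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PointDict(1, 2)
q = PointSlots(1, 2)

# sys.getsizeof is shallow, so add the instance __dict__ for the regular class.
size_dict = sys.getsizeof(p) + sys.getsizeof(p.__dict__)
size_slots = sys.getsizeof(q)
print(size_dict, size_slots)
```

The saving is per instance, so it only matters when there are very many small objects. It does nothing for data already stored in plain dicts and lists.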

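Senior member: iqxd

If you have a large amount of small data and memory matters, don't use general-purpose data structures.
To keep lookups and inserts fast, Python's dict by default allocates roughly 3 key/value slots for every key/value pair it actually stores.
For Python this is fairly normal. Even a doubly linked list in C, with 64-bit pointers, can sometimes use 3 times the memory of the actual data.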
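One rough way to see this overhead is to measure it. The sketch below assumes a record shaped like the `{key: [value, ...]}` dicts in the question, holding floats. The helper `deep_sizeof` and the sample data are made up for illustration.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Recursively sum sys.getsizeof over containers, counting each object once."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


# Hypothetical record: 10 keys, each mapping to a list of 100 floats.
record = {f"key{i}": [float(i * 100 + j) for j in range(100)] for i in range(10)}

n_values = 10 * 100
raw_bytes = n_values * 8  # the same values packed as 8-byte C doubles
total = deep_sizeof(record)
print(total, raw_bytes, round(total / raw_bytes, 1))
```

On CPython every float is a separate heap object, and the list stores an 8-byte pointer to each one. That alone costs several times the packed size, before the dict and list headers are counted.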

Senior member: ebingtel

If you only need to iterate over the data, use a generator instead of a list.
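A small sketch of the difference, using made-up data (squares of integers):

```python
import sys

n = 1_000_000
squares_list = [i * i for i in range(n)]  # materializes every element up front
squares_gen = (i * i for i in range(n))   # produces elements one at a time

# getsizeof is shallow: for the list it counts only the pointer array,
# which already grows with n; the generator object stays small.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

total = sum(squares_gen)  # note: a generator can only be consumed once
```

The trade-off is that a generator supports neither indexing nor a second pass, so this only helps when a single sequential pass is enough.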

Senior member: gjquoiai

pandas will cure what ails you.

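Senior member: tmackan

Python has typed containers too. Besides list, tuple, and dict, it also supports containers restricted to a specific type, for example a list that only holds ints. Worth a look.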

Senior member: tmackan

@tmackan Correction: I meant the array from `from array import array`. It is an array that can only hold items of one specified type.

A Python array is as lean as a C array. When creating an array, you provide a typecode, a letter to determine the underlying C type used to store each item in the array. For example, b is the typecode for signed char. If you create an array('b'), then each item will be stored in a single byte and interpreted as an integer from –128 to 127. For large sequences of numbers, this saves a lot of memory. And Python will not let you put any number that does not match the type for the array.
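A quick sketch of what the quoted passage describes, using the standard-library `array` module. The sample values are made up.

```python
import sys
from array import array

values = list(range(100))
as_list = values
as_array = array("b", values)  # 'b' = signed char, one byte per item

print(as_array.itemsize)  # bytes used per stored item
print(sys.getsizeof(as_array), sys.getsizeof(as_list))

try:
    as_array.append(200)  # outside the signed char range -128..127
except OverflowError as exc:
    print("rejected:", exc)
```

The list figure here covers only its pointer array, not the int objects it points to, so the real gap is larger than the two printed sizes suggest.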

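Senior member: vance123

Python offers more functionality than C, but it pays for that in time and space. Take str as an example: a single object starts with tens of bytes of metadata, and if the string contains even one Chinese character, its memory usage can be up to 3x more than that of a pure-ASCII string.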
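The str overhead is easy to check. On CPython 3.3 and later, a string stores 1, 2, or 4 bytes per character depending on its widest character. A character in the Basic Multilingual Plane such as 中 widens every character in the string to 2 bytes, and a character outside it widens them to 4. A minimal sketch, assuming CPython:

```python
import sys

ascii_text = "a" * 1000
mixed_text = "a" * 999 + "中"  # a single non-ASCII character widens the whole string

print(sys.getsizeof(""))  # fixed per-object overhead of an empty str
print(sys.getsizeof(ascii_text), sys.getsizeof(mixed_text))

# Bytes added by one more character in each representation.
ascii_per_char = sys.getsizeof("a" * 1001) - sys.getsizeof("a" * 1000)
cjk_per_char = sys.getsizeof("中" * 1001) - sys.getsizeof("中" * 1000)
print(ascii_per_char, cjk_per_char)
```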

Senior member: lysS

Python isn't C, so this is just how it is. As I recall, PHP arrays also take about 4 times the actual data size.

Senior member: littleMaple

This feels like an XY problem. If the OP can give more context, it will be easier to find a good solution.

If it is a homogeneous data structure, you can use the libraries mentioned above, such as numpy, pandas, and array.

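You could also consider moving the data into a database and querying it only when you need it, so you don't have to squeeze all of it into memory at once. By the way, Python ships with a built-in database library called sqlite3.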
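A minimal sketch of the sqlite3 approach. The table name `readings`, its columns, and the sample rows are hypothetical, and `":memory:"` is used only to keep the example self-contained. For real data you would pass a file path so the rows live on disk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path here for on-disk storage
conn.execute("CREATE TABLE readings (key TEXT, value REAL)")

# Feed rows from a generator so the full dataset is never held in a Python list.
rows = ((f"key{i % 3}", float(i)) for i in range(1000))
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
conn.commit()

# Let the database do the aggregation and pull back only the result.
cur = conn.execute(
    "SELECT COUNT(*), SUM(value) FROM readings WHERE key = ?", ("key1",)
)
count, total = cur.fetchone()
print(count, total)
conn.close()
```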

Judging by the format you listed, the data looks like JSON, so you could use a library that parses JSON lazily, for example ijson (https://pypi.org/project/ijson/), json-streamer (https://github.com/kashifrazzaqui/json-streamer), bigjson (https://github.com/henu/bigjson), or jsonslicer (https://pypi.org/project/jsonslicer/).

If you don't need exact answers, you can use a succinct data structure to cut the space overhead dramatically, for example a Bloom filter, HyperLogLog, or a Count-Min sketch.
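To give a feel for the trade-off, here is a toy Bloom filter. It is only an illustrative sketch: the sizes (`num_bits`, `num_hashes`) are arbitrary defaults, and using SHA-256 with a seed prefix is one simple choice among many. Membership tests never give a false negative but may give a false positive.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a fixed-size bit array plus several hash positions per item."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
for word in ("alpha", "beta", "gamma"):
    seen.add(word)

print("alpha" in seen)
print(len(seen.bits))  # memory stays fixed no matter how many items are added
```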

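On optimizing memory footprint, have a look at the classic High Performance Python (《Python 高性能编程》). Besides showing how to make Python code faster, it also covers how to make it use less space. It is thorough and detailed, and covers most of the available options.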

Senior member: hotea

If the data has repeated values, categorical can save some memory.
https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html