You want an incremental JSON parser such as yajl, used through one of its Python bindings. An incremental parser reads as little as possible from the input and invokes a callback whenever something meaningful is decoded. For example, to pull only the numbers out of a big JSON file:
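    from yajl import YajlContentHandler, YajlParser  # yajl-py binding

    list_of_numbers = []

    class ContentHandler(YajlContentHandler):
        # Called by the parser each time it decodes a number
        def yajl_number(self, ctx, val):
            list_of_numbers.append(float(val))

    parser = YajlParser(ContentHandler())
    parser.parse(some_file)  # some_file: an already opened file object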
See http://pykler.github.com/yajl-py/ for more info.
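If you just want to see the idea without installing yajl, here is a rough stdlib-only sketch of the same pattern: read the input in small chunks and fire a callback for every number found. Note that `stream_numbers`, `on_number` and `chunk_size` are names made up for this illustration, not part of yajl or yajl-py, and the scanner assumes valid JSON read from a text-mode file object. It only tracks string literals and number tokens, so it is nowhere near a full parser.

```python
import io

NUMBER_CHARS = set("0123456789+-.eE")

def stream_numbers(fileobj, on_number, chunk_size=4096):
    """Hypothetical helper: scan JSON text chunk by chunk and call
    on_number(float) for every numeric literal outside of strings."""
    in_string = False  # currently inside a JSON string literal
    escaped = False    # previous character was a backslash in a string
    token = []         # characters of the number being collected
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        for ch in chunk:
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                continue
            # A number starts with '-' or a digit; once started it may
            # also contain '+', '.', 'e' and 'E'.
            if ch in NUMBER_CHARS and (token or ch in "-0123456789"):
                token.append(ch)
                continue
            if token:
                # Any other character ends the number: report it.
                on_number(float("".join(token)))
                token = []
            if ch == '"':
                in_string = True
    if token:
        # Input ended right after a number (e.g. a bare top-level number).
        on_number(float("".join(token)))

list_of_numbers = []
stream_numbers(io.StringIO('{"a": 1, "b": [2.5, "3", true]}'),
               list_of_numbers.append)
print(list_of_numbers)  # [1.0, 2.5]
```

Because the token buffer survives from one chunk to the next, a number split across two reads is still reported once, and memory use stays bounded by `chunk_size` rather than by the file size. For real work, a proper incremental parser like yajl is still the better choice since it validates the input and reports every event type.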