proxy-pool-note

Proxy Pool Notes #

Using requests #

Basic usage #

python
import requests

# Send a GET request
requests.get(url, params=None, **kwargs)
# params: Dictionary or bytes to be sent in the query string
# return: <Response> object

requests.post(...)
requests.put(...)
requests.delete(...)
requests.head(...)
requests.options(...)

# Send a request
requests.request(method, url, **kwargs)
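
To see how `params` ends up in the query string without touching the network, a request can be built and prepared locally. This is only a minimal sketch: the URL `https://example.com/search` and the parameter names are placeholders, not something taken from these notes.

```python
import requests

# Build the request object but do not send it (no network access needed).
request = requests.Request(
    "GET",
    "https://example.com/search",      # placeholder URL
    params={"q": "proxy", "page": 2},  # hypothetical query parameters
)
prepared = request.prepare()

print(prepared.method)
print(prepared.url)  # the params dict is URL-encoded into the query string
```

Sending the prepared request would normally go through a `requests.Session` (`session.send(prepared)`); `requests.get(url, params=...)` does the same preparation internally.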

Request parameters #

**kwargs :

  • headers : request headers, dict

  • cookies : dict or CookieJar object

  • auth : Auth tuple to enable Basic/Digest/Custom HTTP Auth.

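  • data : request body; a dict or a list of (key, value) tuples is sent form-encoded, while bytes or a file-like object is sent as-is

  • json : JSON request body; any JSON-serializable object (typically a dict or list), which requests serializes for you

  • files : multipart file upload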

    (1) list of file-tuple:
    [(name, fileobj),
    (name, fileobj, content_type),
    (name, fileobj, content_type, custom_headers)]
    
    Example:
    [('field1', open('filepath1', 'rb')),
    ('field2', open('filepath2', 'rb').read()),
    ('field3', ('filename3', open('filepath3', 'rb'))),
    ('field4', ('filename4', open('filepath4', 'rb'), 'image/jpg'))]
    
    (2) dict of files:
    {name: file-like-objects or file-tuple}
    
    Example:
    {'field1': open('filepath1', 'rb'),
     'field2': ('filename2', open('filepath2', 'rb'), 'image/jpg')}
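  • timeout : request timeout in seconds, float or tuple (connect_timeout, read_timeout)

  • allow_redirects : whether to follow redirects, default True (False for requests.head)

  • proxies : dict {protocol: proxy URL}

  • verify : whether to verify the server's TLS certificate, default True; may also be a path to a CA bundle

  • stream : if False (the default), the response body is downloaded immediately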

  • cert : If str, cert file path (.pem). If tuple, ('cert', 'key') pair.
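
The difference between `data` and `json` is easiest to see on a prepared request, which needs no network access. The sketch below also shows one possible shape for the `proxies` and `timeout` arguments; `fetch_via_proxy`, the URL, and the timeout values are hypothetical names and numbers chosen for illustration, and the helper is defined but not called because it needs a reachable proxy.

```python
import json

import requests

URL = "https://example.com/api"  # placeholder URL

# data=dict is form-encoded; json=dict is serialized to a JSON body.
form = requests.Request("POST", URL, data={"a": "1"}).prepare()
as_json = requests.Request("POST", URL, json={"a": "1"}).prepare()

print(form.headers["Content-Type"], form.body)
print(as_json.headers["Content-Type"], json.loads(as_json.body))


def fetch_via_proxy(url, proxy_address, timeout=(3.05, 10)):
    """Hypothetical helper: GET `url` through one HTTP proxy.

    `proxy_address` is assumed to look like "127.0.0.1:8080".
    `timeout` is (connect_timeout, read_timeout) in seconds.
    """
    proxies = {
        "http": f"http://{proxy_address}",
        "https": f"http://{proxy_address}",
    }
    return requests.get(url, proxies=proxies, timeout=timeout)
```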

Response #

requests.models.Response object as r:

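  • r.url : final URL of the request
  • r.ok : bool, True if the status code is below 400
  • r.status_code : response status code
  • r.headers : response headers, a case-insensitive dict; r.headers.get(key) returns None if the key is missing
  • r.encoding : get the character encoding
  • r.encoding = 'utf-8' : set the character encoding
  • r.content : body as bytes
  • r.text : body as str
  • r.raw : urllib3.response.HTTPResponse object, typically used via read() (requires stream=True)
  • r.json() : decodes the body with the JSON decoder built into requests; raises an exception if the body is not valid JSON
  • r.raise_for_status() : raises HTTPError if the status code is 4xx or 5xx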
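
The attributes above can be tried out offline on a hand-built `Response`. This is a sketch for illustration only: `fake_response` is a hypothetical helper, and it fills in the private `_content` attribute directly, which real code would never need to do because requests populates it when a request is sent.

```python
import requests
from requests.structures import CaseInsensitiveDict


def fake_response(status_code, body=b"", headers=None):
    """Hypothetical helper: build a Response by hand, without any network."""
    r = requests.Response()
    r.status_code = status_code
    r._content = body  # private attribute, set here only for the demo
    r.headers = CaseInsensitiveDict(headers or {})
    r.url = "https://example.com/"  # placeholder URL
    return r


no_content = fake_response(204)
not_found = fake_response(
    404, b'{"error": "not found"}', {"Content-Type": "application/json"}
)

# A 2xx status other than 200 is still "ok" and does not raise.
print(no_content.ok)
no_content.raise_for_status()

# Header lookup ignores case; .get() returns None for a missing key.
print(not_found.headers.get("content-type"))
print(not_found.headers.get("X-Missing"))

# 4xx and 5xx statuses raise HTTPError.
try:
    not_found.raise_for_status()
    raised = False
except requests.HTTPError:
    raised = True

print(raised, not_found.json())
```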

Using json #

dumps/dump #

python
import json

# Serialize obj to str
json.dumps(obj, ... , **kwargs)
# return: str of JSON

# Serialize obj and write to a file-like object
json.dump(obj, fp, ... , **kwargs)
# fp: a file-like object whose .write() accepts str, not bytes

**kwargs :

  • skipkeys : If True, dict keys that are not of a basic type (str, int, float, bool, None) will be skipped instead of raising a TypeError. Default is False.
  • ensure_ascii : If False, the return value can contain non-ASCII characters instead of escaping them. Default is True.
  • check_circular : If False, the circular reference check is skipped and a circular reference will result in a RecursionError or worse. Default is True.
  • allow_nan : If False, out of range float values (nan, inf, -inf) will result in a ValueError instead of JavaScript equivalents (NaN, Infinity, -Infinity). Default is True.
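  • indent : indentation for pretty-printing, int (number of spaces) or str; None (the default) gives the most compact, single-line output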
  • separators : tuple (item_separator, key_separator), default is (', ', ': ') if indent is None and (',', ': ') otherwise.
  • sort_keys : If True, the output of dictionaries will be sorted by key. Default is False.
  • default : A function (called with obj, should return a serializable version of obj or raise TypeError). The default simply raises TypeError.
  • cls : Use a custom JSONEncoder subclass instead of JSONEncoder class.
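
A few of these options side by side, as a small sketch. The sample `record` and the `encode_date` function are made up for illustration.

```python
import datetime
import json

record = {"b": 1, "a": "啊"}  # sample data, not from the notes

# Default: non-ASCII characters are escaped.
escaped = json.dumps(record, sort_keys=True)

# ensure_ascii=False keeps the original characters.
readable = json.dumps(record, sort_keys=True, ensure_ascii=False)

# Custom separators drop the spaces for a compact form.
compact = json.dumps(record, sort_keys=True, separators=(",", ":"))

# indent switches to a multi-line, pretty-printed form.
pretty = json.dumps(record, sort_keys=True, indent=2, ensure_ascii=False)


def encode_date(obj):
    """Hypothetical `default` hook: serialize dates as ISO 8601 strings."""
    if isinstance(obj, datetime.date):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


dated = json.dumps({"t": datetime.date(2020, 2, 16)}, default=encode_date)

# allow_nan=False turns NaN/Infinity into an error instead of invalid JSON.
try:
    json.dumps(float("nan"), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True

for value in (escaped, readable, compact, pretty, dated, nan_rejected):
    print(value)
```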

loads/load #

python
import json

# Deserialize str to obj
json.loads(s, ... , **kwargs)
# return: obj

# Deserialize from a file-like object
json.load(fp, ... , **kwargs)
# fp: a file-like object whose .read() returns str or bytes

**kwargs :

  • object_hook : A function that is called with the result of every decoded object literal (a dict). Its return value is used instead of the dict.
  • object_pairs_hook : A function that is called with the result of every decoded object literal as an ordered list of (key, value) pairs. Its return value is used instead of the dict.
  • parse_float : A function that is called with the string of every JSON float to be decoded. The default is equivalent to float(num_str).
  • parse_int : A function that is called with the string of every JSON int to be decoded. The default is equivalent to int(num_str).
  • parse_constant : A function that is called with one of the following strings: -Infinity, Infinity, NaN.
  • cls : Use a custom JSONDecoder subclass instead of JSONDecoder class.
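
The decoding hooks can be sketched as follows. The helper names `reject_duplicates` and `reject_constant` and the sample JSON strings are assumptions made for this example.

```python
import json
from decimal import Decimal
from types import SimpleNamespace

# parse_float: keep the exact decimal text instead of a binary float.
price = json.loads('{"price": 1.10}', parse_float=Decimal)

# object_hook: turn every decoded dict into an attribute-style object.
point = json.loads(
    '{"x": 1, "y": 2}', object_hook=lambda d: SimpleNamespace(**d)
)


def reject_duplicates(pairs):
    """Hypothetical object_pairs_hook: fail on repeated keys."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate keys in {keys}")
    return dict(pairs)


try:
    json.loads('{"a": 1, "a": 2}', object_pairs_hook=reject_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True


def reject_constant(name):
    """Hypothetical parse_constant hook: refuse NaN and +/-Infinity."""
    raise ValueError(f"non-standard JSON constant: {name}")


try:
    json.loads("NaN", parse_constant=reject_constant)
    constant_rejected = False
except ValueError:
    constant_rejected = True

print(price, point, duplicate_rejected, constant_rejected)
```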

Examples #

python
import json

# Parse string from a file
with open('filepath', 'r') as f:
    d = json.load(f)

# Parse bytes from a file (must be UTF-8, UTF-16 or UTF-32 encoded)
with open('filepath', 'rb') as f:
    d = json.load(f)

# Parse string
d = json.loads('{"a": "1"}')

# Parse bytes (must be UTF-8, UTF-16 or UTF-32 encoded)
d = json.loads(b'{"a": "1"}')

# Encode
s = json.dumps({'a': '1'})

# Save to a file
with open('filepath', 'w', encoding='gbk') as f:
    json.dump({'a': '啊'}, f, ensure_ascii=False)

# Saving to a file opened in binary mode raises a TypeError,
# because json.dump always calls `.write()` with a str, not bytes.
'''
with open('filepath', 'wb') as f:
    json.dump({'a': '1'}, f)
'''
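
If a binary-mode file is really needed, one possible workaround is to serialize with `json.dumps` and encode the result explicitly. The sketch below runs in a temporary directory; the file names and sample data are made up for the example.

```python
import json
import os
import tempfile

data = {"a": "啊"}  # sample data

with tempfile.TemporaryDirectory() as tmp:
    utf8_path = os.path.join(tmp, "utf8.json")
    gbk_path = os.path.join(tmp, "gbk.json")

    # Binary mode: serialize to str first, then encode to bytes yourself.
    with open(utf8_path, "wb") as f:
        f.write(json.dumps(data, ensure_ascii=False).encode("utf-8"))

    # json.load accepts a binary file as long as the bytes are UTF-8/16/32.
    with open(utf8_path, "rb") as f:
        loaded_utf8 = json.load(f)

    # Other encodings such as GBK must go through text mode in both directions.
    with open(gbk_path, "w", encoding="gbk") as f:
        json.dump(data, f, ensure_ascii=False)
    with open(gbk_path, "r", encoding="gbk") as f:
        loaded_gbk = json.load(f)

    # json.dump on a binary-mode file fails, because it writes str chunks.
    try:
        with open(utf8_path, "wb") as f:
            json.dump(data, f)
        binary_dump_failed = False
    except TypeError:
        binary_dump_failed = True

print(loaded_utf8, loaded_gbk, binary_dump_failed)
```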
February 16, 2020