How do I fetch the content of a given URL with LuaSocket through a proxy?
This post shows how to use LuaSocket's socket.http module to request a page from a remote server through an HTTP proxy and print the result.
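So far I have the following:

    local socket = require "socket.http"
    -- With a table argument, request returns 1, the status code, and the headers table;
    -- c is that headers table.
    local ok, code, c = socket.request{ url = "http://example.com/", proxy = "<my proxy and port here>" }
    for i, v in pairs( c ) do print( i, v ) end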
This gives me output like the following:

    connection          close
    content-type        text/html; charset=UTF-8
    location            www.iana.org/domains/example/
    vary                Accept-Encoding
    date                Tue, 24 Apr 2012 21:43:19 GMT
    last-modified       Wed, 09 Feb 2011 17:13:15 GMT
    transfer-encoding   chunked
    server              Apache/2.2.3 (CentOS)
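This implies that the connection is established correctly. Now I want to use the same request to fetch the title of the page.

To get at the response body, pass a sink to the request so that the chunks are collected into a table:

    local http = require "socket.http"
    local ltn12 = require "ltn12"

    local result_table = {}
    http.request{
      url = "http://example.com/",
      sink = ltn12.sink.table(result_table),
      proxy = "<my proxy and port here>"
    }
    -- Join the chunks together into a string:
    local result = table.concat(result_table)
    -- Hacky solution to extract the title:
    local title = result:match("<[Tt][Ii][Tt][Ll][Ee]>([^<]*)<")
    print(title)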
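The title pattern above fails when the tag carries attributes (for example `<title lang="en">`) and it keeps any surrounding whitespace. A slightly more tolerant helper might look like the sketch below. `extract_title` is a hypothetical name, not part of LuaSocket, and pattern matching remains a rough substitute for a real HTML parser.

```lua
-- Hypothetical helper: return the text of the first <title> element
-- in an HTML string, or nil if no title is found.
local function extract_title(html)
  -- Case-insensitive tag name; [^>]* tolerates attributes inside the tag.
  local title = html:match("<[Tt][Ii][Tt][Ll][Ee][^>]*>([^<]*)<")
  if not title then
    return nil
  end
  -- Trim both ends, then collapse internal runs of whitespace to one space.
  title = title:gsub("^%s+", ""):gsub("%s+$", ""):gsub("%s+", " ")
  return title
end

print(extract_title("<html><head><TITLE lang=\"en\">\n  Hello   World \n</TITLE></head></html>"))
```

The helper works on any string, so it can be applied to the `result` built from `table.concat(result_table)`.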
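If your proxy stays the same across the whole application, a more direct solution is to set http.PROXY once and use the simple form of the request:

    local http = require "socket.http"
    http.PROXY = "<my proxy and port here>"

    local result = http.request("http://www.youtube.com/watch?v=_eT40eV7OiI")
    local title = result:match("<[Tt][Ii][Tt][Ll][Ee]>([^<]*)<")
    print(title)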
Output:

    Flanders and Swann - A song of the weather - YouTube
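Neither snippet checks whether the request actually succeeded, so a failed connection would surface only as an error on a nil value. As a rough sketch, the request could be wrapped in a function that reports failures. `fetch_body` is a hypothetical helper, and its optional third argument is an assumption added only so that a stub can stand in for `socket.http` when no network is available.

```lua
-- Hypothetical wrapper around the table form of socket.http.request.
-- Returns the body as a string, or nil plus an error message.
-- `http` defaults to LuaSocket's socket.http; pass a stub to run offline.
local function fetch_body(url, proxy, http)
  http = http or require("socket.http")
  local chunks = {}
  local ok, code, headers, status = http.request{
    url = url,
    proxy = proxy, -- when nil, LuaSocket falls back to http.PROXY if it is set
    -- A sink is a plain function; this one collects chunks the same way
    -- ltn12.sink.table does.
    sink = function(chunk)
      if chunk then chunks[#chunks + 1] = chunk end
      return 1
    end,
  }
  if not ok then
    -- On failure the second return value holds the error message.
    return nil, "request failed: " .. tostring(code)
  end
  if code ~= 200 then
    return nil, "unexpected status: " .. tostring(status or code)
  end
  return table.concat(chunks)
end
```

A call would then look like `local body, err = fetch_body("http://example.com/", "<my proxy and port here>")`, with `err` printed or logged when `body` is nil.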

