Developer interface
These are the interfaces that PyPAC exposes to developers.
Main interface
These are the most commonly used components of PyPAC.
- class pypac.PACSession(pac=None, proxy_auth=None, pac_enabled=True, response_proxy_fail_filter=None, exception_proxy_fail_filter=None, socks_scheme='socks5', **kwargs)
A PAC-aware Requests Session that discovers and complies with a PAC file, without any configuration necessary. PAC file discovery is accomplished via the Windows Registry (if applicable), and the Web Proxy Auto-Discovery (WPAD) protocol. Alternatively, a PAC file may be provided in the constructor.
- Parameters:
pac (PACFile) – The PAC file to consult for proxy configuration info. If not provided, then upon the first request, get_pac() is called with default arguments in order to find a PAC file.
proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication.
pac_enabled (bool) – Set to False to disable all PAC functionality, including PAC auto-discovery.
response_proxy_fail_filter – Callable that takes a requests.Response and returns a boolean for whether the response means the proxy used for the request should no longer be used. By default, the response is not inspected.
exception_proxy_fail_filter – Callable that takes an exception and returns a boolean for whether the exception means the proxy used for the request should no longer be used. By default, requests.exceptions.ConnectTimeout and requests.exceptions.ProxyError are matched.
socks_scheme (str) – Scheme to use when the PAC file returns a SOCKS proxy. socks5 by default.
- do_proxy_failover(proxy_url, for_url)
- Parameters:
- Returns:
The next proxy config to try, or ‘DIRECT’.
- Raises:
ProxyConfigExhaustedError – If the PAC file provided no usable proxy configuration.
- get_pac(**kwargs)
Search for, download, and parse the PAC file if it hasn’t already been done. This method is called upon the first use of request(), but can also be called manually beforehand if desired. Subsequent calls to this method will only return the obtained PAC file, if any.
- Returns:
The obtained PAC file, if any.
- Return type:
PACFile|None
- Raises:
MalformedPacError – If something that claims to be a PAC file was downloaded but could not be parsed.
- pac_enabled
Set to False to disable all PAC functionality, including PAC auto-discovery.
- property proxy_auth
Proxy authentication object.
- request(method, url, proxies=None, **kwargs)
- Raises:
ProxyConfigExhaustedError – If the PAC file provided no usable proxy configuration.
MalformedPacError – If something that claims to be a PAC file was downloaded but could not be parsed.
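To illustrate the two failover filter parameters of PACSession, the sketch below defines one callable of each kind and passes them to the constructor. The status codes and the helper name build_session are assumptions made for this example, not PyPAC defaults. The pypac and requests imports are deferred into the functions that need them so that the response filter can be tried on its own.

```python
def gateway_error_filter(response):
    """Return True if the response suggests the proxy itself is failing.

    The status codes are an illustrative choice, not a PyPAC default.
    """
    return response.status_code in (502, 503, 504)


def connection_error_filter(exc):
    """Return True for exceptions that should retire the proxy just used."""
    # Imported lazily so this module loads even where requests is absent.
    import requests

    return isinstance(
        exc,
        (requests.exceptions.ConnectTimeout, requests.exceptions.ProxyError),
    )


def build_session():
    """Create a PACSession wired up with the two filters above."""
    from pypac import PACSession  # requires PyPAC to be installed

    return PACSession(
        response_proxy_fail_filter=gateway_error_filter,
        exception_proxy_fail_filter=connection_error_filter,
    )
```

A session built this way is used like any Requests session, for example build_session().get(url).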
- pypac.get_pac(url=None, js=None, from_os_settings=True, from_dns=True, timeout=2, allowed_content_types=None, session=None, **kwargs)
Convenience function for finding and getting a parsed PAC file (if any) that’s ready to use.
- Parameters:
url (str) – Download PAC from a URL. If provided, from_os_settings and from_dns are ignored.
js (str) – Parse the given string as a PAC file. If provided, from_os_settings and from_dns are ignored.
from_os_settings (bool) – Look for a PAC URL or filesystem path from the OS settings, and use it if present. Has no effect on platforms other than Windows and macOS/OSX.
from_dns (bool) – Look for a PAC file using the WPAD protocol.
timeout – Time to wait for host resolution and response for each URL.
allowed_content_types – If the response has a Content-Type header, then consider the response to be a PAC file only if the header is one of these values. If not specified, the allowed types are application/x-ns-proxy-autoconfig and application/x-javascript-config.
session (requests.Session) – Used for getting potential PAC files. If not specified, a generic session is used.
- Returns:
The first valid parsed PAC file according to the criteria, or None if nothing was found.
- Return type:
PACFile|None
- Raises:
MalformedPacError – If something that claims to be a PAC file was obtained but could not be parsed.
- pypac.collect_pac_urls(from_os_settings=True, from_dns=True, **kwargs)
Get all the URLs that potentially yield a PAC file.
- Parameters:
- Returns:
A list of URLs that should be tried in order.
- Return type:
- pypac.download_pac(candidate_urls, timeout=1, allowed_content_types=None, session=None)
Try to download a PAC file from one of the given candidate URLs.
- Parameters:
candidate_urls (list[str]) – URLs that are expected to return a PAC file. Requests are made in order, one by one.
timeout – Time to wait for host resolution and response for each URL. When a timeout or DNS failure occurs, the next candidate URL is tried.
allowed_content_types – If the response has a Content-Type header, then consider the response to be a PAC file only if the header is one of these values. If not specified, the allowed types are application/x-ns-proxy-autoconfig and application/x-javascript-config.
session (requests.Session) – Used for getting potential PAC files. If not specified, a generic session is used.
- Returns:
Contents of the PAC file, or None if no URL was successful.
- Return type:
str|None
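The Content-Type rule described for allowed_content_types can be sketched as a small standalone check. This is not PyPAC's code: the function name looks_like_pac, the plain-dict headers argument, and the decision to ignore media type parameters such as a charset are all assumptions of this example.

```python
DEFAULT_ALLOWED_TYPES = (
    "application/x-ns-proxy-autoconfig",
    "application/x-javascript-config",
)


def looks_like_pac(headers, allowed_content_types=None):
    """Decide whether response headers are compatible with a PAC file.

    `headers` is a plain dict with the key spelled exactly "Content-Type"
    (real header names are case-insensitive; that is ignored for brevity).
    A response without the header is accepted, since the rule only applies
    when the header is present.
    """
    allowed = allowed_content_types or DEFAULT_ALLOWED_TYPES
    content_type = headers.get("Content-Type")
    if not content_type:
        return True
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in {allowed_type.lower() for allowed_type in allowed}
```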
- pypac.pac_context_for_url(url, proxy_auth=None, pac=None)
This context manager provides a simple way to add rudimentary PAC functionality to code that cannot be modified to use PACSession, but obeys the HTTP_PROXY and HTTPS_PROXY environment variables.
Upon entering this context, PAC discovery occurs with default parameters. If a PAC is found, then it’s asked for the proxy to use for the given URL. The proxy environment variables are then set accordingly.
Note that this provides a very simplified PAC experience that’s insufficient for some scenarios.
- Parameters:
url – Consult the PAC for the proxy to use for this URL.
proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication.
pac (PACFile) – The PAC to use to resolve the proxy. If not provided, get_pac() is called with default arguments in order to find a PAC file.
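The environment-variable half of this behaviour can be pictured with a standard-library context manager. The sketch below is not PyPAC's implementation. The name proxy_env is hypothetical, and the proxy URL is passed in directly instead of being looked up in a PAC file.

```python
import os
from contextlib import contextmanager


@contextmanager
def proxy_env(proxy_url):
    """Temporarily point HTTP_PROXY and HTTPS_PROXY at `proxy_url`.

    Passing None (standing in for a DIRECT result) removes the variables
    for the duration of the block. Previous values are restored on exit.
    """
    names = ("HTTP_PROXY", "HTTPS_PROXY")
    saved = {name: os.environ.get(name) for name in names}
    try:
        for name in names:
            if proxy_url is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = proxy_url
        yield
    finally:
        # Put back whatever was there before, including "not set".
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

Code that honours these variables and runs inside a with proxy_env(...) block sees the temporary proxy setting. Because the proxy is fixed for the whole block, this approach cannot fail over to another proxy, which is one way such a simplified setup falls short.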
PAC parsing and execution
Functions and classes for parsing and executing PAC files.
- class pypac.parser.PACFile(pac_js, **kwargs)
Represents a PAC file.
JavaScript parsing and execution is handled by the dukpy library.
Load a PAC file from a given string of JavaScript. Errors during parsing and validation may raise a specialized exception.
- Parameters:
pac_js (str) – JavaScript that defines the FindProxyForURL() or FindProxyForURLEx() function.
- Raises:
MalformedPacError – If the JavaScript could not be parsed, does not define the expected function, or is otherwise invalid.
- pypac.parser.parse_pac_value(value, socks_scheme=None)
Parse the return value of FindProxyForURL() into a list. List elements will either be the string “DIRECT” or a proxy URL.
For example, the result of parsing PROXY example.local:8080; DIRECT is a list containing the strings http://example.local:8080 and DIRECT.
- pypac.parser.proxy_url(value, socks_scheme=None)
Parse a single proxy config value from FindProxyForURL() into a more usable element.
The recognized keywords are DIRECT, PROXY, SOCKS, SOCKS4, and SOCKS5. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Proxy_servers_and_tunneling/Proxy_Auto-Configuration_PAC_file#return_value_format
- Parameters:
- Returns:
Parsed value, e.g.: DIRECT, http://example.local:8080, or socks5://example.local:8080.
- Return type:
- Raises:
ValueError – If input value is invalid.
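The mapping from PAC keywords to proxy URLs can be sketched in a few lines. This is a standalone approximation, not the library's code. The function names to_proxy_url and parse_value are hypothetical, and the socks4 and socks5 schemes for the SOCKS4 and SOCKS5 keywords are an assumption based on the examples in the text.

```python
def to_proxy_url(value, socks_scheme="socks5"):
    """Convert one PAC result entry, e.g. "PROXY host:8080", to a URL."""
    parts = value.strip().split()
    if not parts:
        raise ValueError("empty proxy config value")
    keyword = parts[0].upper()
    if keyword == "DIRECT" and len(parts) == 1:
        return "DIRECT"
    if len(parts) != 2:
        raise ValueError(f"unrecognized proxy config value: {value!r}")
    schemes = {
        "PROXY": "http",
        "SOCKS": socks_scheme,  # bare SOCKS uses the configurable scheme
        "SOCKS4": "socks4",
        "SOCKS5": "socks5",
    }
    if keyword not in schemes:
        raise ValueError(f"unrecognized keyword: {parts[0]!r}")
    return f"{schemes[keyword]}://{parts[1]}"


def parse_value(value, socks_scheme="socks5"):
    """Split a FindProxyForURL() result on ';' and convert each entry."""
    entries = [entry for entry in value.split(";") if entry.strip()]
    return [to_proxy_url(entry, socks_scheme) for entry in entries]
```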
PAC JavaScript functions
Python implementations of JavaScript functions needed to execute a PAC file.
These are injected into the JavaScript execution context. They aren’t meant to be called directly from Python, so the function signatures may look unusual.
Most docstrings below are adapted from http://findproxyforurl.com/netscape-documentation/.
- pypac.parser_functions.dateRange(*args)
Accepted forms:
dateRange(day)
dateRange(day1, day2)
dateRange(mon)
dateRange(month1, month2)
dateRange(year)
dateRange(year1, year2)
dateRange(day1, month1, day2, month2)
dateRange(month1, year1, month2, year2)
dateRange(day1, month1, year1, day2, month2, year2)
dateRange(day1, month1, year1, day2, month2, year2, gmt)
day is the day of month between 1 and 31 (as an integer).
month is one of the month strings: JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
year is the full year number, for example 1995 (but not 95). Integer.
gmt is the string “GMT”, which makes time comparison occur in the GMT timezone; if left unspecified, times are taken to be in the local timezone.
Even though the above examples don’t show it, the “GMT” parameter can be specified in any of the 9 different call profiles, always as the last parameter.
If only a single value is specified (from each category: day, month, year), the function returns a true value only on days that match that specification. If both values are specified, the result is true between those times, including bounds.
- Return type:
- pypac.parser_functions.dnsResolve(host)
Resolves the given DNS hostname into an IP address, and returns it in the dot-separated format as a string. Returns an empty string if there is an error.
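A standard-library sketch of this contract, under the assumption that an IPv4 lookup through socket.gethostbyname is an acceptable stand-in. The name dns_resolve is hypothetical, and this is not necessarily how PyPAC performs the lookup.

```python
import socket


def dns_resolve(host):
    """Return the IPv4 address for `host`, or "" if resolution fails."""
    try:
        return socket.gethostbyname(host)
    except (OSError, UnicodeError):
        # socket.gaierror is an OSError subclass; UnicodeError covers
        # hostnames that cannot be encoded for the lookup.
        return ""
```

Returning an empty string instead of raising keeps a PAC script running when a hostname cannot be resolved.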
- pypac.parser_functions.isInNet(host, pattern, mask)
Pattern and mask specification is done the same way as for SOCKS configuration.
- Parameters:
host (str) – a DNS hostname, or IP address. If a hostname is passed, it will be resolved into an IP address by this function.
pattern (str) – an IP address pattern in the dot-separated format
mask (str) – mask for the IP address pattern informing which parts of the IP address should be matched against. 0 means ignore, 255 means match.
- Returns:
True iff the IP address of the host matches the specified IP address pattern.
- Return type:
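The pattern and mask comparison amounts to a bitwise AND on the 32-bit form of each address. The sketch below shows that step using the standard ipaddress module. Unlike the PAC function it expects an IP address, not a hostname, and performs no DNS lookup. The name is_in_net is hypothetical.

```python
import ipaddress


def is_in_net(ip, pattern, mask):
    """Return True if `ip` matches `pattern` on the bits selected by `mask`."""
    ip_int = int(ipaddress.IPv4Address(ip))
    pattern_int = int(ipaddress.IPv4Address(pattern))
    mask_int = int(ipaddress.IPv4Address(mask))
    # Compare only the bits where the mask is set.
    return (ip_int & mask_int) == (pattern_int & mask_int)
```

With a mask of 255.255.0.0, only the first two octets take part in the comparison.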
- pypac.parser_functions.myIpAddress()
- Returns:
the IP address of the host that the Navigator is running on, as a string in the dot-separated integer format.
- Return type:
- pypac.parser_functions.shExpMatch(host, pattern)
Case-insensitive host comparison using a shell expression pattern.
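A case-insensitive shell-pattern match can be approximated with the standard fnmatch module. This is a sketch of the idea and may differ from PyPAC in which wildcard syntax it accepts. The name sh_exp_match is hypothetical.

```python
from fnmatch import fnmatchcase


def sh_exp_match(host, pattern):
    """Match `host` against a shell-style `pattern`, ignoring case."""
    # Lower-casing both sides makes the match case-insensitive on every
    # platform, whereas fnmatch.fnmatch depends on the OS.
    return fnmatchcase(host.lower(), pattern.lower())
```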
- pypac.parser_functions.timeRange(*args)
Accepted forms:
timeRange(hour)
timeRange(hour1, hour2)
timeRange(hour1, min1, hour2, min2)
timeRange(hour1, min1, sec1, hour2, min2, sec2)
timeRange(hour1, min1, sec1, hour2, min2, sec2, gmt)
hour is the hour from 0 to 23. (0 is midnight, 23 is 11 pm.)
min is the minutes from 0 to 59.
sec is the seconds from 0 to 59.
gmt is either the string “GMT” for GMT timezone, or not specified, for local timezone. Again, even though the above list doesn’t show it, this parameter may be present in each of the different parameter profiles, always as the last parameter.
- Returns:
True during (or between) the specified time(s).
- Return type:
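As a rough illustration of the accepted forms, the sketch below dispatches on the number of arguments after removing an optional trailing "GMT". Several details are assumptions of this example, not documented behaviour: the end hour of the two-hour form is treated as exclusive, the longer forms use inclusive bounds, ranges that cross midnight are not handled, and the now keyword is a hypothetical testing hook.

```python
from datetime import datetime, time, timezone


def time_range(*args, now=None):
    """Sketch of timeRange(); `now` (a datetime.time) overrides the clock."""
    args = list(args)
    use_gmt = bool(args) and args[-1] == "GMT"
    if use_gmt:
        args.pop()
    if now is None:
        tz = timezone.utc if use_gmt else None
        now = datetime.now(tz).time()  # .time() yields a naive time
    if len(args) == 1:
        return now.hour == args[0]
    if len(args) == 2:
        # Assumption: the end hour is exclusive, so (9, 17) ends at 16:59:59.
        return args[0] <= now.hour < args[1]
    if len(args) == 4:
        h1, m1, h2, m2 = args
        return time(h1, m1) <= now <= time(h2, m2)
    if len(args) == 6:
        h1, m1, s1, h2, m2, s2 = args
        return time(h1, m1, s1) <= now <= time(h2, m2, s2)
    raise ValueError("unsupported timeRange() argument count")
```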
- pypac.parser_functions.weekdayRange(start_day, end_day=None, gmt=None)
Accepted forms:
weekdayRange(wd1)
weekdayRange(wd1, gmt)
weekdayRange(wd1, wd2)
weekdayRange(wd1, wd2, gmt)
If only one parameter is present, the function yields a true value on the weekday that the parameter represents. If the string “GMT” is specified as a second parameter, times are taken to be in GMT, otherwise in local timezone.
If both wd1 and wd2 are defined, the condition is true if the current weekday is in between those two weekdays. Bounds are inclusive. If the gmt parameter is specified, times are taken to be in GMT, otherwise the local timezone is used.
Weekday arguments are one of MON TUE WED THU FRI SAT SUN.
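The four call forms can be sketched as follows. The today keyword (0 for Monday through 6 for Sunday) is a hypothetical hook for testing. Treating a range such as FRI to MON as wrapping around the weekend is an assumption of this sketch.

```python
from datetime import datetime, timezone

WEEKDAYS = ("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")


def weekday_range(start_day, end_day=None, gmt=None, today=None):
    """Sketch of weekdayRange(); `today` overrides the current weekday."""
    if end_day == "GMT":
        # Two-argument form weekdayRange(wd1, "GMT").
        end_day, gmt = None, "GMT"
    if today is None:
        tz = timezone.utc if gmt == "GMT" else None
        today = datetime.now(tz).weekday()
    start = WEEKDAYS.index(start_day)
    if end_day is None:
        return today == start
    end = WEEKDAYS.index(end_day)
    if start <= end:
        return start <= today <= end
    # Assumption: a range such as FRI..MON wraps around the weekend.
    return today >= start or today <= end
```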
Proxy resolution
Tools for working with a given PAC file and its return values.
- class pypac.resolver.ProxyResolver(pac, proxy_auth=None, socks_scheme='socks5')
Handles the lookup of the proxy to use for any given URL, including proxy failover logic.
- Parameters:
pac (pypac.parser.PACFile) – Parsed PAC file.
proxy_auth (requests.auth.HTTPProxyAuth) – Username and password proxy authentication. If provided, then all proxy URLs returned will include these credentials.
socks_scheme (str) – Scheme to assume for SOCKS proxies. socks5 by default. Case-insensitive.
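Proxy failover can be pictured as remembering which proxies have failed and returning the first candidate that has not. The sketch below illustrates that idea only. The class and exception names are hypothetical, and the rule that DIRECT is never marked as failed is an assumption, not a description of ProxyResolver's internals.

```python
class ProxyConfigExhausted(Exception):
    """Raised when every candidate has been marked as failed."""


class FailoverList:
    """Track failed proxies and pick the next usable candidate."""

    def __init__(self):
        self._failed = set()

    def mark_failed(self, proxy_url):
        self._failed.add(proxy_url)

    def first_usable(self, candidates):
        for candidate in candidates:
            # Assumption: DIRECT is always considered usable.
            if candidate == "DIRECT" or candidate not in self._failed:
                return candidate
        raise ProxyConfigExhausted(candidates)
```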
- pypac.resolver.add_proxy_auth(possible_proxy_url, proxy_auth)
Add a username and password to a proxy URL, if the input value is a proxy URL.
- Parameters:
possible_proxy_url (str) – Proxy URL or DIRECT.
proxy_auth (requests.auth.HTTPProxyAuth) – Proxy authentication info.
- Returns:
Proxy URL with auth info added, or DIRECT.
- Return type:
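Adding credentials to a proxy URL can be sketched with urllib.parse. This standalone version takes a username and password directly instead of an HTTPProxyAuth object. It also percent-encodes them, which is a choice made for this example and not necessarily what PyPAC does. The name with_credentials is hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(possible_proxy_url, username, password):
    """Return the proxy URL with credentials added; DIRECT is unchanged."""
    if possible_proxy_url == "DIRECT":
        return possible_proxy_url
    parts = urlsplit(possible_proxy_url)
    # Percent-encode so characters like "@" or ":" cannot break the URL.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    # Strip any existing credentials before adding the new ones.
    host_port = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=f"{userinfo}@{host_port}"))
```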
WPAD functions
Tools for the Web Proxy Auto-Discovery Protocol.
- pypac.wpad.proxy_urls_from_dns(local_hostname=None)
Generate URLs from which to look for a PAC file, based on a hostname. Fully-qualified hostnames are checked against the Public Suffix List to ensure that generated URLs don’t go outside the scope of the organization. If the fully-qualified hostname doesn’t have a recognized TLD, such as in the case of intranets with ‘.local’ or ‘.internal’, the TLD is assumed to be the part following the rightmost dot.
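The fallback rule for unrecognized TLDs can be sketched without the Public Suffix List by treating the last label as the TLD. The function name wpad_urls is hypothetical. The http://wpad.<domain>/wpad.dat shape follows the usual WPAD convention and is assumed here, not taken from PyPAC's source.

```python
def wpad_urls(hostname):
    """List candidate WPAD URLs for a fully-qualified hostname.

    Assumption: the TLD is just the last label (as for ".local"
    intranets); no Public Suffix List lookup is done here.
    """
    labels = hostname.strip(".").split(".")
    urls = []
    # Drop the host label, then walk up while at least "name.tld" remains.
    for start in range(1, len(labels) - 1):
        domain = ".".join(labels[start:])
        urls.append(f"http://wpad.{domain}/wpad.dat")
    return urls
```

Stopping before the bare TLD is what keeps the search inside the organization: a host directly under .local produces no candidates at all.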
OS stuff
Tools for getting the configured PAC file URL out of the OS settings.
- pypac.os_settings.ON_DARWIN = False
True if running on macOS/OSX.
- pypac.os_settings.ON_WINDOWS = False
True if running on Windows.
- pypac.os_settings.autoconfig_url_from_preferences()
Get the PAC AutoConfigURL value from the macOS System Preferences. This setting is visible as the “URL” field in System Preferences > Network > Advanced… > Proxies > Automatic Proxy Configuration.
- Returns:
The value from the System Preferences, or None if the value isn’t configured or available. Note that it may be a local filesystem path instead of a URL.
- Return type:
str|None
- Raises:
NotDarwinError – If called on a non-macOS/OSX platform.
- pypac.os_settings.autoconfig_url_from_registry()
Get the PAC AutoConfigURL value from the Windows Registry. This setting is visible as the “use automatic configuration script” field in Internet Options > Connection > LAN Settings.
The search order is:
HKCU Policies (per-user Group Policy)
HKLM Policies (machine Group Policy)
HKCU Normal (user preference)
HKLM Normal (machine default)
If ProxySettingsPerUser is 0 (per-machine mode), HKCU entries are skipped.
- Returns:
The value from the registry, or None if the value isn’t configured or available. Note that it may be a local filesystem path instead of a URL.
- Return type:
str|None
- Raises:
NotWindowsError – If called on a non-Windows platform.
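The registry search order above can be expressed as a small pure function. It does not touch the real registry: the values dict, keyed by (hive, location) pairs, is a hypothetical stand-in for whatever was read from the four locations, and the name pick_autoconfig_url is likewise invented for this sketch.

```python
SEARCH_ORDER = (
    ("HKCU", "Policies"),
    ("HKLM", "Policies"),
    ("HKCU", "Normal"),
    ("HKLM", "Normal"),
)


def pick_autoconfig_url(values, proxy_settings_per_user=1):
    """Pick the first configured value following the search order.

    `values` maps (hive, location) pairs to the AutoConfigURL string
    found there; missing or empty entries are skipped.
    """
    for hive, location in SEARCH_ORDER:
        if proxy_settings_per_user == 0 and hive == "HKCU":
            continue  # per-machine mode ignores per-user entries
        value = values.get((hive, location))
        if value:
            return value
    return None
```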