Pywikipedia bot with https and http authentication

I had a problem connecting my bot to MediaWiki on the intranet. I believe this is due to http authentication protecting the wiki.

Data:

  • Wiki Root: https://local.example.com/mywiki/
  • When you visit the wiki with a web browser, a pop-up window appears asking for the credentials of the enterprise (I assume this is basic access authentication)

This is what I have in user-config.py:

mylang = 'en'
family = 'mywiki'
usernames['mywiki']['en'] = u'Bot'
authenticate['local.example.com'] = ('user', 'pass')

This is what I have in mywiki_family.py:

# -*- coding: utf-8  -*-
import family, config

# The Wikimedia family that is known as mywiki
class Family(family.Family):
  def __init__(self):
      family.Family.__init__(self)
      self.name = 'mywiki'
      self.langs = { 'en' : 'local.example.com'}

  def scriptpath(self, code):
      return '/mywiki'

  def version(self, code):
      return '1.13.5'

  def isPublic(self):
      return False

  def hostname(self, code):
      return 'local.example.com'

  def protocol(self, code):
      return 'https'

  def path(self, code):
      return '/mywiki/index.php'

When I execute login.py -v -v, I get the following:

urllib2.urlopen(urllib2.Request('https://local.example.com/w/index.php?title=Special:Userlogin&useskin=monobook&action=submit', wpSkipCookieCheck=1&wpPassword=XXXX&wpDomain=&wpRemember=1&wpLoginattempt=Aanmelden%20%26%20Inschrijven&wpName=Bot, {'Content-type': 'application/x-www-form-urlencoded', 'User-agent': 'PythonWikipediaBot/1.0'})):
(Redundant traceback info here)
urllib2.HTTPError: HTTP Error 401: Unauthorized

(I'm not sure why it has "local.example.com/w" instead of "/mywiki".)

I thought it might be trying to authenticate against local.example.com instead of local.example.com/mywiki, so I changed the authenticate line to:

authenticate['local.example.com/mywiki'] = ('user', 'pass')

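But then I get an HTTP 401.2 error back from IIS:

You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.

Any help getting this to work would be appreciated.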

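Update: after fixing my family file, it now says:

Getting information for site mywiki:en
('http error', 401, 'Unauthorized', <httplib.HTTPMessage instance at 0x...>)
WARNING: Could not open 'https://local.example.com/mywiki/index.php?title=Non-existing_page&action=edit&useskin=monobook'. Maybe the server or your connection is down. Retrying in 1 minutes...

I looked at the HTTP headers on a plain urllib2.urlopen call, and the server is sending WWW-Authenticate: Negotiate and WWW-Authenticate: NTLM. I'm guessing urllib2, and therefore pywikipedia, doesn't support this?

Update: I can authenticate using python-ntlm. How do I integrate this into pywikipedia?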
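For reference, the header check described above can be reproduced with a short script. This is only a sketch: `auth_schemes` and `offered_schemes` are hypothetical helper names, not part of pywikipedia, and the parsing assumes each WWW-Authenticate header carries a single challenge, as in the Negotiate/NTLM case here.

```python
try:  # Python 2, which pywikipedia used at the time
    from urllib2 import urlopen, HTTPError
except ImportError:  # Python 3 keeps the same API under urllib
    from urllib.request import urlopen
    from urllib.error import HTTPError


def auth_schemes(header_values):
    """Return the scheme name (first token) of each WWW-Authenticate value.

    Assumption: one challenge per header value; comma-separated lists of
    challenges inside a single header are not split here.
    """
    schemes = []
    for value in header_values:
        parts = value.split(None, 1)
        if parts:
            schemes.append(parts[0])
    return schemes


def offered_schemes(url):
    """Request url without credentials; list the schemes a 401 offers."""
    try:
        urlopen(url)
    except HTTPError as err:
        if err.code != 401:
            raise
        headers = err.headers
        # Python 3 messages expose get_all(); Python 2 ones expose getheaders()
        get_all = getattr(headers, 'get_all', None) or headers.getheaders
        return auth_schemes(get_all('WWW-Authenticate') or [])
    return []  # no 401, so no challenge was issued
```

If the result contains only Negotiate and NTLM, as reported here, a plain basic-auth handler will not be enough.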

+3
2 answers

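Well, the fact that login.py tries '/w' instead of your path shows that there is a family configuration issue.

Your code is indented oddly: is scriptpath a member of the new Family class? As in: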

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'mywiki'
        self.langs = { 'en' : 'local.example.com'}

    def scriptpath(self, code):
        return '/mywiki'

    def version(self, code):
        return '1.13.5'

    def isPublic(self):
        return False

    def hostname(self, code):
        return 'local.example.com'

    def protocol(self, code):
        return 'https'

?

I believe something is wrong with your family file. A good way to check is to run this in a Python console:

import wikipedia
site = wikipedia.getSite('en', 'mywiki')
print site.login_address()

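As long as the relative address is wrong, showing '/w' instead of '/mywiki', the family file is still not configured correctly and the bot won't work :)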
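To illustrate why indentation matters here, the following is a minimal sketch with stand-in classes. `BaseFamily`, `Misindented` and `Correct` are hypothetical names, not pywikipedia's own code, and the '/w' default is an assumption inferred from the URL in the login output.

```python
class BaseFamily(object):
    """Hypothetical stand-in for pywikipedia's family.Family."""

    def scriptpath(self, code):
        # Assumed default, matching the '/w' seen in the login URL
        return '/w'


class Misindented(BaseFamily):
    def __init__(self):
        self.name = 'mywiki'


def scriptpath(self, code):
    # Defined at module level: this is NOT a method of Misindented,
    # so the base class default is still used.
    return '/mywiki'


class Correct(BaseFamily):
    def __init__(self):
        self.name = 'mywiki'

    def scriptpath(self, code):
        # Properly indented inside the class: overrides the default
        return '/mywiki'


print(Misindented().scriptpath('en'))
print(Correct().scriptpath('en'))
```

The first call falls back to the base class path, while the second returns the overridden one, which is the difference the console check above is meant to expose.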

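Update: how to integrate NTLM into pywikipedia?

I just had a look at the basic python-ntlm example. I would integrate the code before this line in login.py: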

response = urllib2.urlopen(urllib2.Request(self.site.protocol() + '://' + self.site.hostname() + address, data, headers))

You would write something like this:

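import urllib2
from ntlm import HTTPNtlmAuthHandler

# Escape the backslash so the DOMAIN\User separator is unambiguous
user = 'DOMAIN\\User'
password = "Password"
# Base URL the credentials apply to
url = self.site.protocol() + '://' + self.site.hostname()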

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)

# create and install the opener
opener = urllib2.build_opener(auth_NTLM)
urllib2.install_opener(opener)

response = urllib2.urlopen(urllib2.Request(self.site.protocol() + '://' + self.site.hostname() + address, data, headers))

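I would test this and integrate it directly into the pywikipedia codebase, if only I had an NTLM setup available...

Whatever happens, please don't vanish with your solution: we at pywikipedia are interested in it :)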

+4

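I'm guessing the problem is that the server expects basic authentication and your client isn't handling it. There is a good article about handling basic authentication in Python.

You didn't provide enough information for me to be sure about this, so if that doesn't work, please add some more details, such as a network trace of your connection attempt.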
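As a rough sketch of what handling basic authentication looks like with the standard library, assuming the server really does offer the Basic scheme (the question reports Negotiate and NTLM instead): `build_basic_auth_opener` is a hypothetical helper name, and the URL and credentials are the placeholders from the question.

```python
try:  # Python 2, which pywikipedia used at the time
    import urllib2 as request_lib
except ImportError:  # Python 3 keeps the same classes in urllib.request
    import urllib.request as request_lib


def build_basic_auth_opener(base_url, user, password):
    """Return an opener that answers Basic challenges under base_url."""
    passman = request_lib.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm means "use these credentials for any realm"
    passman.add_password(None, base_url, user, password)
    handler = request_lib.HTTPBasicAuthHandler(passman)
    return request_lib.build_opener(handler)


# Building the opener does not touch the network; opener.open(url) would.
opener = build_basic_auth_opener(
    'https://local.example.com/mywiki/', 'user', 'pass')
```

Using `opener.open(...)` directly, rather than `install_opener`, keeps the credentials scoped to this opener and leaves other urllib2 calls unchanged.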

0

Source: https://habr.com/ru/post/1714799/

