This macro shows all Wiki pages that have links to external web sites.

I am hoping Juergen will clean up my code a bit and add it to the MoinMoin distro. Meanwhile I am posting it here for people to check out.

Combining my url checker code on EfnetPythonWiki:JürgenHermann with the ExternalLinks macro (note the proposed name change ;) ) would be incredibly nifty. The checking should be done via a link generated by the macro (i.e. only on request by the user). --jh

I agree, but I think we should leave it for the next version. I am out of commission for the next four or five weeks, because I'm doing a cross-country move and then starting a new job. You have my permission to rename the macro to ExternalLinks (much better name) and do with it as you please.

-- SteveHowell
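
The on-request check discussed above could look roughly like the sketch below. It is not the url checker code from EfnetPythonWiki:JürgenHermann; it is a minimal stand-in written for current Python 3, and the name check_link and its opener parameter are made up for illustration. The opener is injectable so the function can be tried without network access.

```python
import urllib.error
import urllib.request


def check_link(url, opener=urllib.request.urlopen, timeout=10):
    """Return (url, ok, detail) for one external link.

    `opener` is a hypothetical hook: any callable with the same calling
    convention as urllib.request.urlopen. `detail` is the HTTP status on
    success (None if the response has no status) or an error message.
    """
    try:
        # HEAD avoids downloading the body; some servers may reject it.
        request = urllib.request.Request(url, method="HEAD")
        response = opener(request, timeout=timeout)
    except (OSError, ValueError) as exc:
        # URLError and HTTPError are OSError subclasses; ValueError covers
        # strings that are not URLs at all.
        return (url, False, str(exc))
    status = getattr(response, "status", None)
    close = getattr(response, "close", None)
    if close is not None:
        close()
    return (url, status is None or status < 400, status)
```

A macro would only call something like this when the user follows a "check links" link, so that ordinary page views never wait on remote servers.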

In its current form, I won't take it into the official code base anyway, since ripping the URLs with your own machinery is too likely to break later, and at the same time there is no official machinery yet. Note that I made some of the page-scanning regexes available in 0.11 [1], so you can be somewhat more official after the 0.11 release.

[1] Use MoinMoin.parser.wiki.Parser.word_rule for wikinames, and MoinMoin.parser.wiki.Parser.url_rule for URLs.
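
As a rough sketch of what "more official" could mean for the macro: the link scanner can take its pattern as a parameter instead of building a private one. The code below assumes the shared rule is a regex source string; whether Parser.url_rule has exactly that shape is an assumption here, and STAND_IN_URL_RULE is a simplified placeholder, not MoinMoin's actual rule.

```python
import re

# Placeholder pattern for illustration only. In a MoinMoin installation the
# rule would instead come from MoinMoin.parser.wiki.Parser.url_rule
# (assumed to be a regex source string).
STAND_IN_URL_RULE = r"(?:https?|ftp)://[^\s<>]+"


def get_links(text, url_rule=STAND_IN_URL_RULE):
    """Return every substring of `text` matched by `url_rule`, in order."""
    pattern = re.compile(url_rule)
    # finditer with group(0) yields the whole match even when the rule
    # contains capturing groups, unlike findall.
    return [match.group(0) for match in pattern.finditer(text)]
```

With this shape the macro no longer carries its own URL grammar, so a change to the parser's rule is picked up automatically.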

code

"""
    MoinMoin - OutsideLinks Macro

    Copyright (c) 2002 by Steve Howell <showell@zipcon.net>
    All rights reserved, see COPYING for details.

    Show all Wiki pages that have links to external web sites.

"""

# Imports
import re, string
from MoinMoin import config, wikiutil
from MoinMoin.Page import Page

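def execute(macro, args):
    # For each page with external links, emit the page link followed by
    # a bullet list of the URLs found on it.
    all_pages = wikiutil.getPageList(config.text_dir)
    all_pages.sort()
    result = ""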
    for page in all_pages:
        if not ignore(page):
            links = getLinks(Page(page).get_raw_body())
            if len(links) > 0:
                result = result + macro.formatter.pagelink(page)
                result = result + macro.formatter.bullet_list(1)
                for link in links:
                    result = result + macro.formatter.listitem(1)
                    result = result + macro.formatter.url(link)
                    result = result + macro.formatter.listitem(0)
                result = result + macro.formatter.bullet_list(0)

    return result

def ignore(page):
    # Skip pages whose names contain any of these substrings
    # (personal, help, and system pages).
    return re.search("Hermann", page) \
           or re.search("Help", page) \
           or re.search("MoinMoin", page) \
           or re.search("Wiki", page)
 
    
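def getLinks(text):
    # Collect every external URL in the page text, skipping .gif images.
    link = link_regex()
    lines = text.split('\n')
    result = []
    for line in lines:
        # findall returns one tuple of groups per match; item[0] is the whole URL.
        for item in link.findall(line):
            if not re.search(r"\.gif", item[0]):
                result.append(item[0])
    return result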

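def link_regex():
    # Match "scheme:rest". A punctuation character is only part of the URL
    # when another URL character follows it, so trailing punctuation is dropped.
    return re.compile(r"((%(url)s)\:([^\s\<%(punct)s]|([%(punct)s][^\s\<%(punct)s]))+)"
    % {
      'url': 'http|https|ftp|nntp|news|telnet|file',
      'punct': re.escape('''"'}]|:,.)?!'''),
     })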
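
To see what the scanner picks up without a wiki installation, the regex and the filtering loop can be tried on their own. The sketch below copies the pattern from link_regex and the logic of getLinks into a standalone form for current Python 3; the names LINK and get_links and the sample text are made up for illustration.

```python
import re

# Same pattern as link_regex() above, compiled once.
PUNCT = re.escape('''"'}]|:,.)?!''')
LINK = re.compile(
    r"((%(url)s)\:([^\s\<%(punct)s]|([%(punct)s][^\s\<%(punct)s]))+)"
    % {"url": "http|https|ftp|nntp|news|telnet|file", "punct": PUNCT}
)


def get_links(text):
    """Standalone copy of getLinks: external URLs, minus .gif images."""
    links = []
    for line in text.split("\n"):
        for groups in LINK.findall(line):
            url = groups[0]  # group 1 is the whole URL
            if not re.search(r"\.gif", url):
                links.append(url)
    return links


sample = (
    "See http://example.com/page, or ftp://ftp.example.org/file.txt.\n"
    "Logo: http://example.com/logo.gif"
)
print(get_links(sample))
# prints ['http://example.com/page', 'ftp://ftp.example.org/file.txt']
```

The comma and the sentence-ending period are left out of the URLs because each is followed by whitespace or the end of the line, and the .gif link is filtered out as an image.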

OutsideLinksCode (last edited 2008-03-04 08:33:09 by localhost)