PmWiki: Size

Q: What is the largest PmWiki site in number of pages?

Pm: There's no theoretical limit to the number of pages, because
PmWiki stores each page as a separate file.

The practical limits are the number of files that can be kept in
a single directory (wiki.d) and the time it takes to search through the
pages. As a couple of reference points, the SciTechWiki
(http://www.sci.tamucc.edu/wiki) currently has 1649 pages and
the TAMUCC Wiki (http://www.tamucc.edu/wiki) has 4500+ pages. Both
seem to run without any problem; even searches don't seem to take
an unreasonable amount of time. Both of these wikis run under
Red Hat Linux, where I have some directories (mailing list archives)
holding 5000+ files with no major difficulties, so I don't think the
OS will pose a barrier to PmWiki, at least up to that point.
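
To compare a given installation against the reference points above, one
can simply count the files in the page directory. The following is a
minimal Python sketch, not part of PmWiki; the function name
count_page_files is hypothetical, and it assumes every regular file in
the directory is a page file.

```python
import os


def count_page_files(directory: str) -> int:
    """Count the regular files directly inside `directory`.

    Hypothetical helper: assumes each regular file is one wiki page
    and does not descend into subdirectories.
    """
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file())


if __name__ == "__main__":
    # "wiki.d" is the page directory named in the text above.
    if os.path.isdir("wiki.d"):
        print(count_page_files("wiki.d"))
```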

However, I did think about these problems a bit when designing PmWiki,
and here's what I've decided thus far. First, as far as directory
limitations go, I designed the $PageFileFmt, $WikiDir, and $WikiLibDirs
variables so that an admin can organize the files into a structure
other than just "wiki.d/Group.Pagename". For example,
$PageFileFmt could be changed to '$Group/$Title_' or '$Group/$PageName'
and files would be stored as "wiki.d/Group/Pagename" or
"wiki.d/Group/Group.Pagename" respectively, which would reduce the
number of files in any single directory. Something would still have to
create the directories for the groups, but this isn't a big problem.
And even if this approach doesn't break things up sufficiently (e.g.,
a single group with thousands of pages), there are still other options:
with a couple of very minor extensions the files can be organized into
directories based on the first character(s) of the title, as in
"wiki.d/P/Group.Pagename" and "wiki.d/W/Group.WikiWord".
This would spread the load out among more directories as well.
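
The layouts described above can be sketched as a mapping from a page's
group and title to a file path. This is an illustration in Python, not
PmWiki's own PHP code and not how $PageFileFmt is actually expanded; the
function name page_path and the layout labels are hypothetical and
exist only to reproduce the example paths given in the text.

```python
def page_path(group: str, title: str, layout: str = "flat",
              wiki_dir: str = "wiki.d") -> str:
    """Return the storage path for a page under one of four layouts.

    Hypothetical layout labels (not PmWiki settings):
      flat       -> wiki.d/Group.Pagename        (single directory)
      group      -> wiki.d/Group/Pagename        (one directory per group)
      group_full -> wiki.d/Group/Group.Pagename  (per group, full name kept)
      initial    -> wiki.d/P/Group.Pagename      (first character of title)
    """
    if not group or not title:
        raise ValueError("group and title must be non-empty")
    full_name = f"{group}.{title}"
    if layout == "flat":
        relative = full_name
    elif layout == "group":
        relative = f"{group}/{title}"
    elif layout == "group_full":
        relative = f"{group}/{full_name}"
    elif layout == "initial":
        relative = f"{title[0]}/{full_name}"
    else:
        raise ValueError(f"unknown layout: {layout}")
    return f"{wiki_dir}/{relative}"


if __name__ == "__main__":
    for layout in ("flat", "group", "group_full", "initial"):
        print(layout, page_path("Group", "Pagename", layout))
```

With the "initial" layout, pages whose titles start with different
characters land in different directories, so even a single large group
is spread across several of them.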

And, of course, I could always look at relational databases or
other indexing schemes if it became necessary to do so. However,
I like simplicity, and this is definitely one of those areas where
I've chosen to avoid gratuitous features; i.e., take the simple
approach for now and build the complex implementation only
when a real demonstrated need arises, at which point the real
parameters of the problem are better known. I'm also quite
comfortable that if we need to change the back-end storage model at
some point it'll be easy to create a migration path from the existing
schema to a new one.

Note: SearchWiki is becoming more sophisticated as PmWiki grows, and
advanced search features are being added. If the search features
included in PmWiki are not sufficient, there are already some excellent
site indexing and retrieval systems available, and it's much more
effective to make use of those existing packages than to try to
duplicate their functionality in PmWiki.

This page was last updated on December 31, 2003, at 04:18 AM