This page describes methods to import XML dumps. An XML dump is not a full backup of the wiki database: the dump does not contain user accounts, images, edit logs, etc.

There are several methods for importing these XML dumps, and they differ in how well they cope with size. One method may result in timeouts or connection failures when used on large dumps. Another is recommended for general use, but is slow for very big data sets. For very large amounts of data, such as a dump of a big Wikipedia, use mwdumper, and import the link tables as separate SQL dumps.

If the file is compressed and has a recognized file extension such as .bz2, it is decompressed automatically. Note: optimizing the database after the import is recommended, as it can reduce the database size by a factor of two or three. You can also use the edit.
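For scripts that need to read a dump themselves (for example to inspect or preprocess it before importing), the same extension-based decompression can be approximated in Python. This is only a sketch, not the importer's own code: the helper name `open_dump` is hypothetical, and the set of handled extensions (.gz and .bz2) is an assumption.

```python
import bz2
import gzip
from pathlib import Path


def open_dump(path):
    """Open an XML dump as UTF-8 text, picking a decompressor from the extension.

    Hypothetical helper: handles .gz and .bz2 (an assumption) and falls back
    to reading the file as plain text.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    if suffix == ".bz2":
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

The returned object is an ordinary text file handle, so it can be iterated line by line without loading the whole dump into memory.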

If available, you can fill the link tables by importing separate SQL dumps of these tables directly with the mysql command line client. For Wikimedia wikis, this data is available along with the XML dumps.

This is not recommended for large data sets. Once installed on your computer, you can use the specific tool pagefromfile. For example, here is an excerpt of a wiki file output by the command ruby dumpxml2wiki. The name of the page is inside an HTML comment, set off by three quotes, on the first (start) line.
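As an illustration of the link-table import mentioned above, the following Python sketch feeds an SQL dump to the mysql command line client on standard input. The user name `wikiuser`, the database name `wikidb` and the file name `pagelinks.sql` are placeholders, and the helper names are hypothetical; adjust them to your own setup.

```python
import subprocess


def build_mysql_import_command(user, database):
    """Return the mysql client invocation; -p makes the client prompt for a password."""
    return ["mysql", "-u", user, "-p", database]


def import_sql_dump(sql_path, user, database):
    """Pipe an uncompressed SQL dump into the mysql client (hypothetical helper)."""
    with open(sql_path, "rb") as sql_file:
        subprocess.run(
            build_mysql_import_command(user, database),
            stdin=sql_file,
            check=True,  # raise CalledProcessError if mysql exits with an error
        )


# Example call with placeholder names:
# import_sql_dump("pagelinks.sql", "wikiuser", "wikidb")
```

This is equivalent to redirecting the file into the client from a shell; the SQL dump has to be decompressed first if it was downloaded in compressed form.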

Note that the name of the page can be written in Unicode. Reader parser with the ability to search documents via XPath or CSS3 selectors, from the latest generation of XML parsers for Ruby. WikiDAT tool, as suggested by Felipe Ortega.

You can do any of the following:

Remove the prefix from the interwiki table. This will preserve page titles, but prevent interwiki linking through that prefix. Example: you will preserve the page title ‘Meta:Blah blah’ but will not be able to use the prefix ‘meta:’ to link to meta.

Replace the unwanted prefix in the XML file with “Project:” before importing. This will preserve the functionality of the prefix as an interlink, but will replace the prefix in the page titles with the name of the wiki into which they are imported, and might be quite a pain to do on large dumps.
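Replacing the prefix by hand is tedious on a large dump, but it can be scripted. The sketch below is one possible approach in Python, not an official tool: it streams the file line by line and rewrites only the start of each page title. It assumes that every `<title>` element sits on a single line, and the function names and file names are hypothetical.

```python
import re


def replace_title_prefix(lines, old_prefix, new_prefix="Project:"):
    """Yield lines with old_prefix replaced by new_prefix at the start of <title> elements.

    Text outside <title> tags is left untouched.
    """
    pattern = re.compile(r"(<title>)" + re.escape(old_prefix))
    for line in lines:
        yield pattern.sub(lambda match: match.group(1) + new_prefix, line)


def rewrite_dump(source_path, target_path, old_prefix):
    """Stream an uncompressed dump to a new file with rewritten titles."""
    with open(source_path, "r", encoding="utf-8") as source, \
            open(target_path, "w", encoding="utf-8") as target:
        target.writelines(replace_title_prefix(source, old_prefix))


# Example call with placeholder file names:
# rewrite_dump("dump.xml", "dump-fixed.xml", "Meta:")
```

Because the file is processed one line at a time, memory use stays small even for large dumps. Links inside the page text that use the old prefix are not changed by this sketch.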

