2012-11-29 50 views
2

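Indexing an XML file from phpMyAdmin with Solr

I am new to Solr and want to index an XML file created by phpMyAdmin. However, the file is organized as tables and columns, and Solr does not index it when I simply put it in the example directory.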

=====================================================================

<?xml version="1.0" encoding="utf-8"?> 
<!-- 
- phpMyAdmin XML Dump 
- version 3.5.1 
- http://www.phpmyadmin.net 
- 
- Host: localhost 
- Generation Time: Nov 22, 2012 at 07:33 AM 
- Server version: 5.5.24-log 
- PHP Version: 5.3.13 
--> 

<pma_xml_export version="1.0" xmlns:pma="http://www.phpmyadmin.net/some_doc_url/"> 
    <!-- 
    - Structure schemas 
    --> 
    <pma:structure_schemas> 
     <pma:database name="blog" collation="latin1_swedish_ci" charset="latin1"> 
      <pma:table name="post"> 
       CREATE TABLE `post` (
        `post_id` int(11) NOT NULL, 
        `Title` varchar(50) NOT NULL, 
        `Author` varchar(50) NOT NULL, 
        `Status` varchar(15) NOT NULL, 
        `Date` date NOT NULL, 
        `Time` time NOT NULL, 
        `Text` varchar(1000) NOT NULL, 
        `Category` varchar(25) NOT NULL, 
        `Tags` varchar(10000) NOT NULL, 
        `Links` varchar(10000) NOT NULL, 
        `Ratings` int(11) NOT NULL, 
        PRIMARY KEY (`post_id`), 
        UNIQUE KEY `post_id` (`post_id`), 
        UNIQUE KEY `post_id_2` (`post_id`) 
       ) ENGINE=InnoDB DEFAULT CHARSET=latin1; 
      </pma:table> 
     </pma:database> 
    </pma:structure_schemas> 

    <!-- 
    - Database: 'blog' 
    --> 
    <database name="blog"> 
     <!-- Table post --> 
     <table name="post"> 
      <column name="post_id">1</column> 
      <column name="Title">Human Interface device to Com-port</column> 
      <column name="Author">n72.241</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-03-19</column> 
      <column name="Time">10:00:09</column> 
      <column name="Text">Is there a way to force input from USB HID into specific com-port?</column> 
      <column name="Category">Human Interface</column> 
      <column name="Tags">interface, com-port, device, human interface</column> 
      <column name="Links">www.something.com</column> 
      <column name="Ratings">8</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">2</column> 
      <column name="Title">Human Interface device to Com-port</column> 
      <column name="Author">Narmeen</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-03-19</column> 
      <column name="Time">10:15:30</column> 
      <column name="Text">What do you exactly mean? serial data throughput is thousands of time slower then usb</column> 
      <column name="Category"></column> 
      <column name="Tags"></column> 
      <column name="Links"></column> 
      <column name="Ratings">0</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">3</column> 
      <column name="Title">Human Interface device to Com-port</column> 
      <column name="Author">orb</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-03-19</column> 
      <column name="Time">10:25:30</column> 
      <column name="Text">on hardware/firmware level or OS/driver level, and if OS/driver, then what OS?</column> 
      <column name="Category"></column> 
      <column name="Tags"></column> 
      <column name="Links"></column> 
      <column name="Ratings">0</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">4</column> 
      <column name="Title">Human Interface device to Com-port</column> 
      <column name="Author">someone</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-03-19</column> 
      <column name="Time">11:00:00</column> 
      <column name="Text">Im putting some long text to see how its looks on the main site.A human interface device or HID is a type of computer device that interacts directly with, and most often takes input from, humans and may deliver output to humans. The term &quot;HID&quot; most commonly refers to the USB-HID specification. The term was coined by Mike Van Flandern of Microsoft when he proposed the USB committee create a Human Input Device class working group.[when?] The working group was renamed as the Human Interface Device class at the suggestion of Tom Schmidt of DEC because the proposed standard supported bi-directional communication.[when?]ww/</column> 
      <column name="Category"></column> 
      <column name="Tags"></column> 
      <column name="Links"></column> 
      <column name="Ratings">0</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">5</column> 
      <column name="Title">Human Interface device to Com-port</column> 
      <column name="Author">n72.241</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-11-08</column> 
      <column name="Time">11:15:00</column> 
      <column name="Text">Human interface guidelines (HIG) are software development documents which offer application developers a set of recommendations. Their aim is to improve the experience for the users by making application interfaces more intuitive, learnable, and consistent. Most guides limit themselves to defining a common look and feel for applications in a particular desktop environment. The guides enumerate specific policies. Policies are sometimes based on studies of human-computer interaction (so called usability studies), but most are based on arbitrary conventions chosen by the platform developers. The central aim of a HIG is to create a consistent experience across the environment (generally an operating system or desktop environment), including the applications and other tools being used. This means both applying the same visual design and creating consistent access to and behaviour of common elements of the interface - from simple ones such as buttons and icons up to more complex construction</column> 
      <column name="Category">Human interface</column> 
      <column name="Tags">human cateogy text checking</column> 
      <column name="Links">something.com</column> 
      <column name="Ratings">8</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">6</column> 
      <column name="Title">some other things</column> 
      <column name="Author">me</column> 
      <column name="Status">not answered</column> 
      <column name="Date">2012-11-30</column> 
      <column name="Time">10:00:00</column> 
      <column name="Text">Rommendations and advice meant to help developers create better applications. Developers sometimes intentionally choose to break them if they think that the guidelines do not fit their application, or usability testing reveals an advantage in doing so. But in turn, the organization publishing the HIG might withhold endorsement of the application. Mozilla Firefox's user interface, for example, goes against the GNOME project's HIG, which is one of the main arguments for</column> 
      <column name="Category">Not right</column> 
      <column name="Tags">here, there anywhere</column> 
      <column name="Links">checking.com</column> 
      <column name="Ratings">5</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">7</column> 
      <column name="Title">some other things again</column> 
      <column name="Author">xyz</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-12-29</column> 
      <column name="Time">12:00:00</column> 
      <column name="Text">Human interface guidelines often describe the visual design rules, including icon and window design and style. Frequently they specify how user input and interaction mechanisms work. Aside from the detailed rules, guidelines sometimes also make broader suggestions about how to organize and design the application and write user-interface text. 
HIGs are also done for applications. In this case the HIG will build on a platform HIG by adding the common semantics for a range of application functions</column> 
      <column name="Category">nothing</column> 
      <column name="Tags">sfksdjghsklgjlsgj</column> 
      <column name="Links">something.com</column> 
      <column name="Ratings">0</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">8</column> 
      <column name="Title">noting to say</column> 
      <column name="Author">na</column> 
      <column name="Status"></column> 
      <column name="Date">2012-11-16</column> 
      <column name="Time">00:00:00</column> 
      <column name="Text">what if this dosent works then what will i do now here</column> 
      <column name="Category">sdfsdfs</column> 
      <column name="Tags"></column> 
      <column name="Links"></column> 
      <column name="Ratings">0</column> 
     </table> 
     <table name="post"> 
      <column name="post_id">9</column> 
      <column name="Title">checkinf for time</column> 
      <column name="Author">na</column> 
      <column name="Status">Answered</column> 
      <column name="Date">2012-10-10</column> 
      <column name="Time">09:00:00</column> 
      <column name="Text">hoping this works now</column> 
      <column name="Category">nothing</column> 
      <column name="Tags">afjalfjaf</column> 
      <column name="Links">kdflsdfj</column> 
      <column name="Ratings">8</column> 
     </table> 
    </database> 
</pma_xml_export> 

Answers

0

First, to index the XML file you need to convert it to Solr's XML document format, which looks like this:

<add> 
    <doc> 
     <field ... 
     <field ... 
    </doc> 
    <doc> 
     <field ... 
     <field ... 
    </doc> 
</add> 
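
The conversion can be scripted rather than done by hand. Below is a minimal sketch in Python, not a definitive implementation. It assumes each `<table>` element under `<database>` in the phpMyAdmin dump is one row, that `post_id` should become the Solr unique key `id`, and that every other column name is already declared as a field in your `schema.xml`. The function name `pma_to_solr_add` and the `SAMPLE` data are made up for illustration. Empty columns are skipped, and the `Date` column is passed through unchanged, so it would need reformatting (for example to `2012-03-19T00:00:00Z`) before it could go into a Solr date field.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a phpMyAdmin dump (hypothetical sample, same layout as the export).
SAMPLE = """<pma_xml_export version="1.0" xmlns:pma="http://www.phpmyadmin.net/some_doc_url/">
  <pma:structure_schemas>
    <pma:database name="blog">
      <pma:table name="post">CREATE TABLE `post` (...);</pma:table>
    </pma:database>
  </pma:structure_schemas>
  <database name="blog">
    <table name="post">
      <column name="post_id">1</column>
      <column name="Title">Human Interface device to Com-port</column>
    </table>
    <table name="post">
      <column name="post_id">2</column>
      <column name="Category"></column>
    </table>
  </database>
</pma_xml_export>"""


def pma_to_solr_add(pma_xml, id_column="post_id"):
    """Turn a phpMyAdmin XML dump into a Solr <add> document (as a string)."""
    root = ET.fromstring(pma_xml)
    add = ET.Element("add")
    # Only the un-namespaced <database>/<table> elements hold row data;
    # the pma:-prefixed ones hold the CREATE TABLE statement and are not matched.
    for row in root.findall("database/table"):
        doc = ET.SubElement(add, "doc")
        for col in row.findall("column"):
            value = (col.text or "").strip()
            if not value:
                continue  # skip empty columns instead of sending empty fields
            name = col.get("name")
            # Assumption: the primary key column maps to Solr's "id" field.
            field = ET.SubElement(doc, "field", name="id" if name == id_column else name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(pma_to_solr_add(SAMPLE))
```

For a real export, read the file as bytes (`open(path, "rb").read()`) and pass the result to the function, then save the returned string to a file that Solr can ingest.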

Alternatively, you can use the Data Import Handler to index the data by reading it directly from the relational database.

+0

Thank you very much, Parvin. How do I convert it to that format, then? Manually, or with a specific application? – nematy

+0

I haven't used it myself, but there are PHP clients for Solr. Take a look at them; they may help. https://github.com/basdenooijer/solarium –
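
A client library is optional for a one-off import, since Solr's update handler accepts `<add>` XML over plain HTTP. The sketch below builds such a request using only the Python standard library. The URL `http://localhost:8983/solr/update` is an assumption based on the default single-core example setup and will differ for multi-core installs. The helper name `build_update_request` is made up for illustration.

```python
import urllib.request


def build_update_request(add_xml, solr_url="http://localhost:8983/solr/update"):
    """Build (but do not send) a POST request carrying Solr <add> XML."""
    return urllib.request.Request(
        solr_url + "?commit=true",  # commit so the documents become searchable
        data=add_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_update_request('<add><doc><field name="id">1</field></doc></add>')
    # Sending requires a running Solr instance at the assumed URL.
    with urllib.request.urlopen(request) as response:
        print(response.status)
```

Building the request separately from sending it lets you inspect the payload before anything reaches the index.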