Create a HashMap with a fixed key that maps to a HashSet: point of departure

My goal is to create a HashMap that uses a String as the key and a HashSet of Strings as the value.


OUTPUT

Here is the result:

Hudson+(surname)=[Q2720681], Hudson,+Quebec=[Q141445], Hudson+(given+name)=[Q5928530], Hudson,+Colorado=[Q2272323], Hudson,+Illinois=[Q2672022], Hudson,+Indiana=[Q2710584], Hudson,+Ontario=[Q5928505], Hudson,+Buenos+Aires+Province=[Q10298710], Hudson,+Florida=[Q768903]]

What I want it to look like is this:

[Hudson+(surname)=[Q2720681,Q141445,Q5928530,Q2272323,Q2672022]]

The goal is to store a particular name from Wikidata together with all the Q values associated with it. For example:

This is the page for Bush.

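I want Bush to be the Key. Then, for all the different points of departure, that is, all the different ways Bush could be connected to a terminal page on Wikidata, I want to store the corresponding Q value (its unique alphanumeric identifier).

What I am actually doing is scraping the different names from the Wikipedia disambiguation page and then looking up the identifier for each of them on Wikidata.

For example, for Bush we get: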

George H. W. Bush 
George W. Bush
Jeb Bush
Bush family
Bush (surname) 

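The corresponding Q values are:

George H. W. Bush (Q23505)

George W. Bush (Q207)

Jeb Bush (Q221997)

Bush family (Q2743830)

Bush (Q1484464)

So the entry I want is: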

Key: Bush : Q23505, Q207, Q221997, Q2743830, Q1484464

Instead, at the moment every name becomes its own key with a single Q value:

Key: Jeb Bush : Q221997

Key: George W. Bush : Q207

and so on.

The whole project is on GitHub.

Here is the method that adds values to the map:

// add Q values to their arrayList in the hash map at the index of the appropriate entity
public static HashSet<String> put_to_hash(String key, String value) 
{
    if (!q_valMap.containsKey(key)) 
    {
        return q_valMap.put(key, new HashSet<String>() );
    }
    HashSet<String> list = q_valMap.get(key);
    list.add(value);
    return q_valMap.put(key, list);
}

It is called from this loop:

    while ((line_by_line = wiki_data_pagecontent.readLine()) != null) 
    {
        // if we can determine it a disambig page we need to send it off to get all 
        // the possible senses in which it can be used.
        Pattern disambig_pattern = Pattern.compile("<div class=\"wikibase-entitytermsview-heading-description \">Wikipedia disambiguation page</div>");
        Matcher disambig_indicator = disambig_pattern.matcher(line_by_line);
        if (disambig_indicator.matches()) 
        {
            //off to get the different usages
            Wikipedia_Disambig_Fetcher.all_possibilities( variable_entity );
        }
        else
        {
            //get the Q value off the page by matching
            Pattern q_page_pattern = Pattern.compile("<!-- wikibase-toolbar --><span class=\"wikibase-toolbar-container\"><span class=\"wikibase-toolbar-item " +
                    "wikibase-toolbar \">\\[<span class=\"wikibase-toolbar-item wikibase-toolbar-button wikibase-toolbar-button-edit\"><a " +
                    "href=\"/wiki/Special:SetSiteLink/(.*?)\">edit</a></span>\\]</span></span>");

            Matcher match_Q_component = q_page_pattern.matcher(line_by_line);
            if ( match_Q_component.matches() ) 
            {
                String Q = match_Q_component.group(1);

                // 'Q' should be appended to an array, since each entity can hold multiple
                // Q values on that basis of disambig
                put_to_hash( variable_entity, Q );
            }
        }

    }

And this is the method that handles disambiguation pages:

public static void all_possibilities( String variable_entity ) throws Exception
{
    System.out.println("this is a disambig page");
    //if it a disambig page we know we can go right to the wikipedia


    //get it normal wiki disambig page
    Document docx = Jsoup.connect( "https://en.wikipedia.org/wiki/" + variable_entity ).get();



        //this can handle the less structured ones. 
        Elements linx = docx.select( "p:contains(" + variable_entity + ") ~ ul a:eq(0)" );

        for (Element linq : linx) 
        {
            System.out.println(linq.text());
            String linq_nospace = linq.text().replace(' ', '+');
            Wikidata_Q_Reader.getQ( linq_nospace );

        }

}

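So what I need is for the Key to stay fixed at the original entity while every Q value found through its disambiguation page is added to that Key's set.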


The structure you chose (a HashMap from String to Set<String>) looks right for this, but there is a bug in your "put" method:

public static HashSet<String> put_to_hash(String key, String value) 
{
    if (!q_valMap.containsKey(key)) 
    {
        return q_valMap.put(key, new HashSet<String>() );
    }
    HashSet<String> list = q_valMap.get(key);
    list.add(value);
    return q_valMap.put(key, list);
}

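In the branch where the key is not yet in the map (if (!q_valMap.containsKey(key))), you put a new, empty HashSet and return straight away, so value is never added to it. (The method also returns whatever put returns, which is the previous mapping, so null in this case.) As a result, the first Q value for every key is lost.

Something like this should work instead. (I renamed the variable to valSet, since it holds a Set rather than a list.)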
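To make the effect concrete, here is a minimal sketch that reuses the quoted method unchanged. The class name FirstValueLostDemo and the locally declared q_valMap are assumptions made only so the snippet runs on its own; the real map lives elsewhere in the project.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

class FirstValueLostDemo {
    // Hypothetical stand-in for the static map used in the question.
    static final Map<String, HashSet<String>> q_valMap = new HashMap<>();

    // Same logic as the method quoted above.
    static HashSet<String> put_to_hash(String key, String value) {
        if (!q_valMap.containsKey(key)) {
            // An empty set is stored and the method returns; value is never added.
            return q_valMap.put(key, new HashSet<String>());
        }
        HashSet<String> list = q_valMap.get(key);
        list.add(value);
        return q_valMap.put(key, list);
    }

    public static void main(String[] args) {
        put_to_hash("Bush", "Q23505"); // first call only creates the empty set
        put_to_hash("Bush", "Q207");   // second call is the first one that stores a value
        System.out.println(q_valMap);  // prints {Bush=[Q207]}
    }
}
```

Q23505 never reaches the set, because the first call for a key returns before list.add(value) is executed.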

public static HashSet<String> put_to_hash(String key, String value) 
{
    if (!q_valMap.containsKey(key)) {
        q_valMap.put(key, new HashSet<String>());
    } 
    HashSet<String> valSet = q_valMap.get(key);
    valSet.add(value);
    return valSet;
}

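Note that the map holds a reference to the Set: once the Set is in the map, adding a value to it changes the very Set the map holds, so there is no need to put it back afterwards.

You could also look at Guava's Multimap, which handles this pattern for you.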
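As a possible variation on the corrected method above, Java 8's Map.computeIfAbsent can replace the containsKey check. The sketch below is only an illustration under assumptions: FixedKeyDemo, qValMap, putToHash and fixedKey are hypothetical names, and LinkedHashSet (a subclass of HashSet) is used purely so that the printed order is predictable. The caller always passes the original entity as the key, whichever disambiguated name the Q value came from.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;

class FixedKeyDemo {
    // Hypothetical stand-in for the question's q_valMap.
    static final Map<String, HashSet<String>> qValMap = new HashMap<>();

    // Creates the set on first use, then always adds the value to it.
    static HashSet<String> putToHash(String key, String value) {
        HashSet<String> valSet = qValMap.computeIfAbsent(key, k -> new LinkedHashSet<>());
        valSet.add(value);
        return valSet;
    }

    public static void main(String[] args) {
        // The key stays fixed at the original entity for every lookup result.
        String fixedKey = "Bush";
        String[] qValues = {"Q23505", "Q207", "Q221997", "Q2743830", "Q1484464"};
        for (String q : qValues) {
            putToHash(fixedKey, q);
        }
        System.out.println(qValMap);
        // prints {Bush=[Q23505, Q207, Q221997, Q2743830, Q1484464]}
    }
}
```

In the question's code this would mean carrying the original entity through the disambiguation lookup and using it, not the disambiguated name, as the key when storing each Q value.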


Source: https://habr.com/ru/post/1583996/

