Tutorial 5: Annotation tools

By the time you have worked through this tutorial you will be able to:

  • Parse a CellML file into a Model instance;

  • Determine the type of item which has a given id;

  • Use the Annotator class to retrieve an item using only its id string;

  • Repair duplicated id strings within the model scope; and

  • Automatically generate and assign unique ids to any or all items.

Requirements

Either download the entire folder, or:

C++

Python

Resources

Background

“Marco Polo” is a game played with many people in a swimming pool. One person calls “Marco” with their eyes closed. Others answer “Polo” and the first person must find them by following the sound. In this tutorial you are given id strings and a mystery CellML model file. We will work through how the Annotator class can be used to locate the desired objects.

Step 1: Parse the mystery model

1.a Read the file “MysteryModel.cellml” into a string.

1.b Create a Parser item.

1.c Use the parser to deserialise the contents of the string you’ve read and return the model.

1.d Check that the parser has not raised any issues.

Show C++ snippet

    //  1.a 
    //      Read the mystery file, MysteryModel.cellml.
    std::ifstream inFile("MysteryModel.cellml");
    std::stringstream inFileContents;
    inFileContents << inFile.rdbuf();

    //  1.b 
    //      Create a Parser item.
    auto parser = libcellml::Parser::create();

    //  1.c 
    //      Use the parser to deserialise the contents of the string you've read
    //      and return the model.
    auto model = parser->parseModel(inFileContents.str());

    //  1.d 
    //      Check that the parser has not raised any issues.
    printIssues(parser);

Show Python snippet

    #  1.a 
    #      Read the mystery file, MysteryModel.cellml.
    with open("MysteryModel.cellml") as read_file:
        file_contents = read_file.read()

    #  1.b 
    #      Create a Parser item.
    parser = Parser()

    #  1.c 
    #      Use the parser to deserialise the contents of the string you've read
    #      and return the model.
    model = parser.parseModel(file_contents)

    #  1.d 
    #      Check that the parser has not raised any issues.
    print_issues(parser)

Step 2: Retrieve an item with a unique id

Annotator class

Tutorial functions

  • C++: getCellmlElementTypeFromEnum will return a string version of the CellmlElementType enumeration

  • Python: get_cellml_element_type_from_enum

2.a Create an Annotator item and use its setModel function to pass in the parsed mystery model.

  • In C++: The item function returns a libcellml::AnyItem, a std::pair whose first attribute is a libcellml::CellmlElementType enumeration and whose second attribute is the item itself, wrapped in a std::any.

  • In Python: The item function returns a tuple. The first item is a CellmlElementType enumeration, the second is the item itself.
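
The type-plus-item pair is meant to be read in two stages: check the type first, then use the item. Below is a minimal stand-alone sketch of that pattern using only the Python standard library. The ElementType enumeration and the describe helper are hypothetical stand-ins invented for illustration; they are not part of libCellML.

```python
from enum import Enum, auto


class ElementType(Enum):
    """Hypothetical stand-in for the CellmlElementType enumeration."""
    UNDEFINED = auto()
    UNITS = auto()
    VARIABLE = auto()


def describe(tagged_item):
    """Branch on the type tag (first entry) before using the item (second entry)."""
    item_type, item = tagged_item
    if item_type is ElementType.UNDEFINED:
        return 'nothing usable was found'
    return '{} named {}'.format(item_type.name, item)


print(describe((ElementType.VARIABLE, 'x')))
print(describe((ElementType.UNDEFINED, None)))
```

Checking the tag before touching the item matters most in the UNDEFINED case, where there is no usable item to work with.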

Show C++ snippet

    //  2.a
    //      Create an Annotator item and use the setModel function to pass in the parsed
    //      mystery model.
    auto annotator = libcellml::Annotator::create();
    annotator->setModel(model);

Show Python snippet

    #  2.a
    #      Create an Annotator item and use the setModel function to pass in the parsed
    #      mystery model.
    annotator = Annotator()
    annotator.setModel(model)

2.b Retrieve the item with an id of “marco”. Use the helper function to convert the enumeration of its type into a string, and print to the terminal.

The item with ID 'marco' is a VARIABLE

2.c Check that the annotator has not reported any issues.

2.d (C++ only) Cast the second attribute of the marco item into a libcellml::VariablePtr item using std::any_cast.

Show C++ snippet

    //  2.b
    //      Retrieve the item with an id of "marco".  Use the helper function
    //      getCellmlElementTypeFromEnum to convert the enumeration of its type into a
    //      string for printing to the terminal.
    libcellml::AnyItem marcoItem = annotator->item("marco");
    std::cout << "The item with ID 'marco' is a " << getCellmlElementTypeFromEnum(marcoItem.first) << std::endl;

    //  2.c
    //      Check that the annotator has not reported any issues.
    printIssues(annotator);

    //  2.d
    //      Now that we know the marco item's type using its first attribute (it should
    //      be a libcellml::CellmlElementType::VARIABLE) we can cast it into a usable item
    //      using std::any_cast.  Cast the second attribute of the marco item into a
    //      libcellml::VariablePtr item.
    auto marcoVariable = std::any_cast<libcellml::VariablePtr>(marcoItem.second);

Show Python snippet

    #  2.b
    #      Retrieve the item with an id of 'marco'.  Use the helper function
    #      get_cellml_element_type_from_enum to convert the enumeration of its type into a
    #      string for printing to the terminal.
    marco_item = annotator.item('marco')
    print('The item with ID "marco" is a {}'.format(get_cellml_element_type_from_enum(marco_item[0])))

    # The item with ID 'marco' is a VARIABLE

    #  2.c
    #      Check that the annotator has not reported any issues.
    print_issues(annotator)

    #  2.d
    #      Now that we know the marco item's type using its first attribute (it should
    #      be a CellmlElementType.VARIABLE) we can name its second attribute so we know
    #      what it is.
    marco_variable = marco_item[1]

Step 3: Retrieve items whose ids are not unique

3.a Now try the same procedure to find the item with id of “polo”. Retrieve the item and print its type to the terminal.

The type of item with ID "polo" is UNDEFINED

3.b The item type returned is UNDEFINED … so we need to check what the annotator has to say about it. Retrieve the issues from the annotator and print them to the terminal.

Show C++ snippet

    //  3.a
    //      Now try the same procedure to find the item with id of "polo".
    //      Retrieve the item and print its type to the terminal.
    auto poloItem = annotator->item("polo");
    std::cout << "The type of item with ID 'polo' is " << getCellmlElementTypeFromEnum(poloItem.first) << std::endl;

    //  3.b
    //      The item type returned is libcellml::CellmlElementType::UNDEFINED ... so we 
    //      need to check what the annotator has to say about it. 
    //      Retrieve the issues from the annotator and print to the terminal.
    printIssues(annotator);

Show Python snippet

    #  3.a
    #      Now try the same procedure to find the item with id of 'polo'.
    #      Retrieve the item and print its type to the terminal.
    polo_item = annotator.item('polo')
    print('The type of item with ID "polo" is {}'.format(get_cellml_element_type_from_enum(polo_item[0])))

    #  3.b
    #      The item type returned is CellmlElementType.UNDEFINED ... so we 
    #      need to check what the annotator has to say about it. 
    #      Retrieve the issues from the annotator and print to the terminal.
    print_issues(annotator)

Recorded 1 issues:
Issue [0] is a WARNING:
    description: The id 'polo' occurs 6 times in the model so a unique item cannot be located.
    stored item type: UNDEFINED
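
The warning arises because a lookup by id can only hand back a single item when exactly one element carries that id. The sketch below illustrates the idea on a tiny hand-written XML string using only the Python standard library. It is not how libCellML is implemented; the XML text and the index_by_id and find_unique functions are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature document, NOT the contents of MysteryModel.cellml.
XML = """
<model>
  <units id="polo"/>
  <component>
    <variable id="marco"/>
    <variable id="polo"/>
  </component>
</model>
"""


def index_by_id(xml_text):
    """Map each id string to the tag names of every element that carries it."""
    index = defaultdict(list)
    for element in ET.fromstring(xml_text).iter():
        element_id = element.get('id')
        if element_id is not None:
            index[element_id].append(element.tag)
    return index


def find_unique(index, wanted):
    """Return the single tag for an id, or None if it is missing or ambiguous."""
    matches = index.get(wanted, [])
    return matches[0] if len(matches) == 1 else None


index = index_by_id(XML)
print(find_unique(index, 'marco'))  # variable
print(find_unique(index, 'polo'))   # None, because two elements share the id
print(index['polo'])                # ['units', 'variable']
```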

3.c Since the id is not unique, we need to retrieve all items with that id for investigation. Use the items function to retrieve the vector of items with id “polo”, and iterate through it printing the different types to the terminal.

Show C++ snippet

    //  3.c
    //      Use the items function to retrieve the vector of items with id "polo", 
    //      and iterate through it printing the different types to the terminal.
    auto poloItems = annotator->items("polo");
    std::cout << "The items with an id of 'polo' have types of:" << std::endl;
    size_t index = 0;
    for (auto &item : poloItems) {
        std::cout << "  - [" << index << "] " << getCellmlElementTypeFromEnum(item.first) << std::endl;
        ++index; 
    }

Show Python snippet

    #  3.c
    #      Use the items function to retrieve the list of items with id 'polo', 
    #      and iterate through it printing the different types to the terminal.
    polo_items = annotator.items('polo')
    print('The items with an id of "polo" have types of:')
    index = 0
    for item in polo_items:
        print('  - [{}] {}'.format(index, get_cellml_element_type_from_enum(item[0])))
        index += 1

The items with an id of 'polo' have types of:
  - [0] UNITS
  - [1] UNITS
  - [2] UNIT
  - [3] VARIABLE
  - [4] RESET
  - [5] RESET_VALUE

The item we want has type UNIT, and we’d like it to be unique so that we can annotate it properly. We need to change the other items to have other (also unique) ids. The Annotator class can create a unique id for an item using the assignId function.

3.d Assign an automatic id to all of the items with id “polo”, except for the one whose type is UNIT.

3.e Check that the id of “polo” is now unique in the model by calling the isUnique function.

Show C++ snippet

    //  3.d
    //      Assign an automatic id to all of the items with id "polo", except for the one whose
    //      type is UNIT.
    poloItem = poloItems.at(2);
    assert(poloItem.first == libcellml::CellmlElementType::UNIT);
    poloItems.erase(poloItems.begin() + 2);

    for (auto &item : poloItems) {
        annotator->assignId(item);
    }

    //  3.e
    //      Check that the id of "polo" is now unique in the model by calling the 
    //      isUnique function.
    assert(annotator->isUnique("polo"));

Show Python snippet

    #  3.d
    #      Assign an automatic id to all of the items with id 'polo', except for the one whose
    #      type is UNIT.
    polo_unit = polo_items.pop(2)
    for item in polo_items:
        annotator.assignId(item)

    #  3.e
    #      Check that the id of 'polo' is now unique in the model by calling the 
    #      isUnique function.
    assert(annotator.isUnique('polo'))
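
The snippets above pick out the UNIT item by its position (index 2) in the returned collection. Where the position is not known in advance, the same split can be made by type instead. The following is a minimal sketch of that idea with plain tuples; the type names are written as strings and the item labels 'a' to 'f' are hypothetical placeholders, so this is not libCellML code.

```python
def split_by_type(tagged_items, wanted_type):
    """Separate (type, item) pairs into those matching wanted_type and the rest."""
    keep = [pair for pair in tagged_items if pair[0] == wanted_type]
    others = [pair for pair in tagged_items if pair[0] != wanted_type]
    return keep, others


# Hypothetical (type name, item label) pairs in the order printed in step 3.c.
polo_items = [
    ('UNITS', 'a'), ('UNITS', 'b'), ('UNIT', 'c'),
    ('VARIABLE', 'd'), ('RESET', 'e'), ('RESET_VALUE', 'f'),
]

keep, others = split_by_type(polo_items, 'UNIT')

# Exactly one item should keep the id; the other five need new ones.
print(keep)         # [('UNIT', 'c')]
print(len(others))  # 5
```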

Now we know that there is only one item in the model with id “polo”, and we also know that it has type UNIT. In circumstances where you know the type of the item with the id you’re fetching ahead of time, you can retrieve it without the need to cast using the direct functions according to type: these are listed below.

3.f Retrieve the Unit with id “polo” directly. The Unit class has two attributes:

  • units() returns the parent Units item; and

  • index() returns the index of this unit within its parent.

Show C++ snippet

    //  3.f
    //      Retrieve the Unit with id polo without casting.
    auto poloUnit = annotator->unit("polo");

Show Python snippet

    #  3.f
    #      Retrieve the Unit with id polo without casting.
    polo_unit = annotator.unit('polo')

Step 4: Discover items whose ids are unknown

Now that we’ve found Marco and fixed the duplicates of Polo, we’d like to know what other ids are being used in this model.

4.a Use the ids function to return a vector of id strings used in the model, and print them to the terminal.

Show C++ snippet

    //  4.a
    //      Use the Annotator::ids function to return a vector of id strings used in the model, and 
    //      print them to the terminal.
    std::cout << "The id strings used in the model are:" << std::endl;
    auto ids = annotator->ids();
    for (auto &id : ids) {
        std::cout << "  - '" << id << "'" << std::endl;
    }

Show Python snippet

    #  4.a
    #      Use the Annotator.ids function to return a list of id strings used in the model, and 
    #      print them to the terminal.
    print('The id strings used in the model are:')
    ids = annotator.ids()
    for id in ids:
        print('  - "{}"'.format(id))

The id strings used in the model are:
    - "b4da55"
    - "b4da56"
    - "b4da57"
    - "b4da58"
    - "b4da59"
    - "i_am_a_component"
    - "marco"
    - "me_too"
    - "polo"
    - "someOtherDuplicatedId"
    - "someOtherId"
    - "whoAmIAndWhereDidIComeFrom"

The hex strings printed are those which have been automatically generated by the assignId function; we can also see the “marco” and “polo” ids as expected.

4.b Use the duplicateIds function to return a vector of those ids which have been duplicated in the model. Use the itemCount function to return the number of times each occurs, and print to the terminal.

Show C++ snippet

    //  4.b
    //      Use the duplicateIds function to return a vector of those ids which have been duplicated in 
    //      the model, and print them to the terminal.
    std::cout << "Duplicated id strings are:" << std::endl;
    auto duplicatedIds = annotator->duplicateIds();
    for (auto &id : duplicatedIds) {
        std::cout << "  - '" << id << "' occurs " << annotator->itemCount(id) << " times." << std::endl;
    }

Show Python snippet

    #  4.b
    #      Use the duplicateIds function to return a vector of those ids which have been duplicated in 
    #      the model, and print them to the terminal.
    print('Duplicated id strings are:')
    duplicated_ids = annotator.duplicateIds()
    for id in duplicated_ids:
        print('  - "{}" occurs {} times'.format(id, annotator.itemCount(id)))

Duplicated id strings are:
- "someOtherDuplicatedId" occurs 3 times
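
Conceptually, the list of all ids, the list of duplicated ids, and the per-id count are three views of one tally. The sketch below shows that relationship with collections.Counter from the Python standard library. The found_ids list is a hypothetical flat list made up for illustration, and the code is not a description of libCellML's internals.

```python
from collections import Counter

# Hypothetical flat list of id strings, one entry per item that carries an id.
found_ids = [
    'marco', 'someOtherDuplicatedId', 'polo',
    'someOtherDuplicatedId', 'someOtherId', 'someOtherDuplicatedId',
]

counts = Counter(found_ids)

# Every distinct id, sorted for stable printing.
all_ids = sorted(counts)

# Only the ids that occur more than once, with their counts.
duplicates = {id_string: n for id_string, n in counts.items() if n > 1}

print(all_ids)
print(duplicates)  # {'someOtherDuplicatedId': 3}
```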

Step 5: Trace provenance of imported items

The final step is to make sure that imported items can have their annotations tracked back to their sources too.

5.a Retrieve an item with id of “whoAmIAndWhereDidIComeFrom” and print its item type to the terminal.

The type of item with ID "whoAmIAndWhereDidIComeFrom" is UNITS

5.b Cast it into a CellML item of the appropriate type.

5.c Use its isImport() function to verify that it is imported.

5.d Create an Importer instance and use it to resolve this model’s imports. Check that it has not raised any issues.

Show C++ snippet

    //  5.a
    //      Retrieve an item with id of "whoAmIAndWhereDidIComeFrom" and print its item type
    //      to the terminal.
    auto whoAmIAndWhereDidIComeFrom = annotator->item("whoAmIAndWhereDidIComeFrom");
    std::cout << "The type of item with ID 'whoAmIAndWhereDidIComeFrom' is " << getCellmlElementTypeFromEnum(whoAmIAndWhereDidIComeFrom.first) << std::endl;
    
    //  5.b
    //      Cast it into a CellML item of the appropriate type.
    auto units = std::any_cast<libcellml::UnitsPtr>(whoAmIAndWhereDidIComeFrom.second);

    //  5.c
    //      Use the Units::isImport() function to verify that it is imported.
    assert(units->isImport());

    //  5.d
    //      Create an Importer instance and use it to resolve this model's imports.
    //      Check that it has not raised any issues.
    auto importer = libcellml::Importer::create();
    importer->resolveImports(model, "");
    printIssues(importer);

Show Python snippet

    #  5.a
    #      Retrieve an item with id of 'whoAmIAndWhereDidIComeFrom' and print its item type
    #      to the terminal.
    who_am_i = annotator.item('whoAmIAndWhereDidIComeFrom')
    print('The type of item with ID "whoAmIAndWhereDidIComeFrom" is {}'.format(get_cellml_element_type_from_enum(who_am_i[0])))
    
    #  5.b
    #      Cast it into a CellML item of the appropriate type.
    units = who_am_i[1]

    #  5.c
    #      Use the Units.isImport() function to verify that it is imported.
    assert(units.isImport())

    #  5.d
    #      Create an Importer instance and use it to resolve this model's imports.
    #      Check that it has not raised any issues.
    importer = Importer()
    importer.resolveImports(model, '')
    print_issues(importer)

5.e Retrieve all the information needed to locate any annotations on the original item:

  • the URL from which it was imported; and

  • the id of the item in the original model.

Print these to the terminal.

Show C++ snippet

    //  5.e
    //      Retrieve all the information needed to locate any annotations on the 
    //      original item:
    //           - the URL from which it was imported; and
    //           - the id of the item in the original model.
    //      Print these to the terminal.
    auto url = units->importSource()->url();
    auto reference = units->importReference();
    auto importedId = units->importSource()->model()->units(reference)->id();

    std::cout << "The units with id 'whoAmIAndWhereDidIComeFrom' came from:" << std::endl;
    std::cout << "  - url: " << url << std::endl;
    std::cout << "  - id: " << importedId << std::endl;
    

Show Python snippet

    #  5.e
    #      Retrieve all the information needed to locate any annotations on the 
    #      original item:
    #           - the URL from which it was imported; and
    #           - the id of the item in the original model.
    #      Print these to the terminal.
    url = units.importSource().url()
    reference = units.importReference()
    imported_id = units.importSource().model().units(reference).id()

    print('The units with id "whoAmIAndWhereDidIComeFrom" came from:')
    print('  - url: {}'.format(url))
    print('  - id: {}'.format(imported_id))
    
The units with id "whoAmIAndWhereDidIComeFrom" came from:
- url: AnotherMysteryModel.cellml
- id: i_am_a_units_item
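
Tracing provenance is a two-hop lookup: the imported item records a URL and a reference name, and the reference name is then looked up inside the model found at that URL to read the original id. The sketch below models those two hops with plain Python data structures. The ImportedUnits class, the SOURCE_MODELS table, the trace function, and the reference name 'some_units_name' are all hypothetical; only the URL and the final id are taken from the output shown above.

```python
from dataclasses import dataclass


@dataclass
class ImportedUnits:
    """Hypothetical record of an imported units item."""
    local_id: str   # id used in the importing model
    url: str        # file the item is imported from
    reference: str  # name of the item inside that file


# Hypothetical stand-in for resolved source models:
# url -> {units name in that file: id in that file}.
SOURCE_MODELS = {
    'AnotherMysteryModel.cellml': {'some_units_name': 'i_am_a_units_item'},
}


def trace(imported):
    """Follow the url, then the reference name, to reach the id in the source file."""
    source_units = SOURCE_MODELS[imported.url]
    return imported.url, source_units[imported.reference]


item = ImportedUnits('whoAmIAndWhereDidIComeFrom',
                     'AnotherMysteryModel.cellml',
                     'some_units_name')
url, original_id = trace(item)
print('  - url: {}'.format(url))
print('  - id: {}'.format(original_id))
```

The local id and the original id are independent strings, which is why both the URL and the original id are needed to find annotations made against the source file.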

Step 6: Bulk operations

6.a Loop through all of the model’s components and print their id to the terminal. Use the assignIds function with an item type of libcellml::CellmlElementType::COMPONENT to give all of the items of that type a new unique id. Print the ids again and notice that the blanks have been filled with automatically generated strings, but existing ids are unchanged.

Show C++ snippet

    //  6.a
    //      Loop through all of the model's components and print their id to the terminal.
    //      Use the assignIds function with an item type (libcellml::CellmlElementType::COMPONENT)
    //      to give all of the items of that type a new unique id.  Print the ids again and
    //      notice that the blanks have been filled with automatically generated strings, 
    //      but existing ids are unchanged. 
    std::cout << "Before automatic assignment the components have ids:" << std::endl;
    for (size_t i = 0; i < model->componentCount(); ++i) {
        std::cout << "  - '" << model->component(i)->id() << "'" << std::endl;
    }

    annotator->assignIds(libcellml::CellmlElementType::COMPONENT);

    std::cout << "After automatic assignment the components have ids:" << std::endl;
    for (size_t i = 0; i < model->componentCount(); ++i) {
        std::cout << "  - '" << model->component(i)->id() << "'" << std::endl;
    }

Show Python snippet

    #  6.a
    #      Loop through all of the model's components and print their id to the terminal.
    #      Use the assignIds function with an item type (CellmlElementType.COMPONENT)
    #      to give all of the items of that type a new unique id.  Print the ids again and
    #      notice that the blanks have been filled with automatically generated strings, 
    #      but existing ids are unchanged. 
    print('Before automatic assignment the components have ids:')
    for index in range(0, model.componentCount()):
        print('  - "{}"'.format(model.component(index).id()))

    annotator.assignIds(CellmlElementType.COMPONENT)

    print('After automatic assignment the components have ids:')
    for index in range(0, model.componentCount()):
        print('  - "{}"'.format(model.component(index).id()))

Before automatic assignment the components have ids:
    - "i_am_a_component"
    - ""
    - ""
    - ""
    - "me_too"
    - ""

After automatic assignment the components have ids:
    - "i_am_a_component"
    - "b4da5a"
    - "b4da5b"
    - "b4da5c"
    - "me_too"
    - "b4da5d"

Finally, we decide that it’s too cold for swimming, and want to nuke all the ids and go home.

6.b Use the clearAllIds function to completely remove all id strings from the model. Check that they have gone by repeating step 4.a to print any ids to the terminal.

There are 0 ids in the model.

Go looking for Marco, but he’s gone home already.

6.c Retrieve the item with id “marco” and print its type to the terminal. Retrieve and print any issues in the annotator to the terminal.

The type of item with ID "marco" is UNDEFINED

The Annotator has found 1 issues:
Warning[0]:
    Description: Could not find an item with an id of 'marco' in the model.

Now we regret nuking our friends, and make plans to return tomorrow and annotate everything.

6.d Use the assignAllIds function to give an automatic id to everything which doesn’t already have one (which is everything now!).

6.e Try to retrieve duplicated ids from the annotator as in step 4.b, and check that it returns an empty list.

There are 0 duplicated ids in the model.

Show C++ snippet

    //  6.b
    //      Finally, we decide that it's too cold for swimming, and want to nuke all the ids
    //      and go home.
    //      Use the clearAllIds function to completely remove all id strings from the model.
    //      Check that they have gone by repeating step 4.a to print any ids to the terminal.
    annotator->clearAllIds();
    ids = annotator->ids();
    std::cout << "There are " << ids.size() << " ids in the model." << std::endl;

    //  6.c
    //      Go looking for Marco, but he's gone home already.
    //      Try to retrieve the item with id "marco" and print its type to the terminal.
    //      Retrieve and print any issues to the terminal.
    marcoItem = annotator->item("marco");
    std::cout << "The type of item with ID 'marco' is " << getCellmlElementTypeFromEnum(marcoItem.first) << std::endl;
    printIssues(annotator);

    //  6.d
    //      Regret nuking our friends and make plans to return tomorrow and
    //      annotate everything.  Use the assignAllIds function to give an automatic
    //      id to everything in the model.
    annotator->assignAllIds();

    //  6.e
    //      Try to retrieve duplicated ids from the annotator as in step 4.b, and
    //      check that it returns an empty list.
    duplicatedIds = annotator->duplicateIds();
    std::cout << "There are " << duplicatedIds.size() << " duplicated ids in the model." << std::endl;

Show Python snippet

    #  6.b
    #      Finally, we decide that it's too cold for swimming, and want to nuke all the ids
    #      and go home.
    #      Use the clearAllIds function to completely remove all id strings from the model.
    #      Check that they have gone by repeating step 4.a to print any ids to the terminal.
    annotator.clearAllIds()
    ids = annotator.ids()
    print('There are {} ids in the model.'.format(len(ids)))

    #  6.c
    #      Go looking for Marco, but he's gone home already.
    #      Try to retrieve the item with id 'marco' and print its type to the terminal.
    #      Retrieve and print any issues to the terminal.
    marco_item = annotator.item('marco')
    print('The type of item with ID "marco" is {}'.format(get_cellml_element_type_from_enum(marco_item[0])))
    print_issues(annotator)

    #  6.d
    #      Regret nuking our friends and make plans to return tomorrow and
    #      annotate everything.  Use the assignAllIds function to give an automatic
    #      id to everything in the model.
    annotator.assignAllIds()

    #  6.e
    #      Try to retrieve duplicated ids from the annotator as in step 4.b, and
    #      check that it returns an empty list.
    duplicated_ids = annotator.duplicateIds()
    print('There are {} duplicated ids in the model.'.format(len(duplicated_ids)))