Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change default representation of edge attributes from vectors to lists & other miscellaneous adjustments #274

Merged
merged 13 commits into from
Jan 14, 2025

Conversation

maxloeffler
Copy link
Contributor

@maxloeffler maxloeffler commented Dec 5, 2024

Prerequisites

  • I adhere to the coding conventions (described here) in my code.
  • I have updated the copyright headers of the files I have modified.
  • I have written appropriate commit messages, i.e., I have recorded the goal, the need, the needed changes, and the location of my code modifications for each commit. This includes also, e.g., referencing to relevant issues.
  • I have put signed-off tags in all commits.
  • I have updated the changelog file NEWS.md appropriately.
  • I have checked whether I need to adjust the showcase file showcase.R with respect to my changes.
  • The pull request is opened against the branch dev.

Description

  • Replace deprecated igraph functions by their more recent counterparts.
  • Handle reading empty commit interaction data explicitly, to avoid crashing in that case.
  • To make splitting networks by ranges more consistent with splitting time-based, also add range information to the return value.
  • Convert certain edge attributes of networks to list type. This change is motivated by a more stricter type enforcement in igraph version 2.0 and up. When merging networks through igraph::disjoint_union, edge attributes that exist in both networks are now assumed to have identical types. But, when we simplify a network, its edge attributes convert to list type as they now need to store data that a vector in R cannot store anymore, i.e., one edge attribute now has to hold several lists of data of all the edges that got merged into one. Therefore, we need to convert edge attributes to list by default to be able to merge simplified and unsimplified networks.

Changelog

Added

  • Introduce convert.edge.attributes.to.list function.

Fixed

  • Fix reading empty commit interaction data.

Changed / Improved

  • Replace deprecated igraph functions.
  • Add range information to the return value in split.network.time.based.by.ranges, to be consistent with split.data.time.based.
  • Convert edge attributes to a list format to ensure compatibility with igraph.

Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 93.44262% with 4 lines in your changes missing coverage. Please review.

Project coverage is 81.09%. Comparing base (1d1fe7f) to head (0cc59bc).
Report is 15 commits behind head on dev.

Files with missing lines Patch % Lines
util-networks-metrics.R 0.00% 2 Missing ⚠️
util-networks.R 97.82% 1 Missing ⚠️
util-read.R 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #274      +/-   ##
==========================================
+ Coverage   80.95%   81.09%   +0.14%     
==========================================
  Files          16       16              
  Lines        5051     5095      +44     
==========================================
+ Hits         4089     4132      +43     
- Misses        962      963       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

maxloeffler and others added 3 commits December 7, 2024 17:06
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
The interaction reading breaks if an interaction is empty in the data.
This commit adds a check for empty interactions.

Signed-off-by: Christian Hechtl <hechtl@cs.uni-saarland.de>
Since version 2.0.0 of package `igraph`, the function `igraph::power_law_fit`
(and its deprecated alias `igraph::fit.power.law`) does not automatically
compute the p-value any more, which is needed by `metrics.scale.freeness`.
In version 2.1.1 of package `igraph`, they have added a new parameter
`p.value` to provide the possibility to enable p-value computation
again. In addition, they have added a parameter `p.precision` which
defaults to a precision of 0.01.

To make `metrics.scale.freeness` work again with the most recent version
of package `igraph`, the deprecated function calls to
`igraph::fit.power.law` are replaced by `igraph::power_law_fit` and are
adjusted to use the new parameter `p.value` to automatically compute the
p-value again. However, we do not make use of `p.precision` and rely on
the default precision instead.

Signed-off-by: Thomas Bock <bockthom@cs.uni-saarland.de>
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While looking over the rest of this PR, I only found a couple of minor issues, please find the individual comments below.

Regarding the uncovered lines reported by codecov: I suggest to ignore them. 2 out of 4 missing lines are part of an untested module anyway. The other two lines refer to very special cases that have been previously untested too.

This change makes splitting by ranges more consistent with time-based
splitting.

This works towards se-sic#273.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Since igraph version 2.1, when joining networks using
'igraph::disjoint_union', edge attributes of the joining networks
require identical types. As simplifiying networks necessarily converts
types of edge attributes to list when merging edges, attributes now have
to be of type list by default.

Edge attributes that are explicitly considered during simplification
and, therefore, are not converted to lists are excluded from this rule.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Adjust the tests in accordance to converting edge attributes to list
type in the implementation.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
This is necessary since igraph falsely fills in non-present edge
attributes with NULLs instead of NAs in certain cases when using
'igraph::disjoint_union' and 'igraph::add_edges'.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
'plyr::rbind.fill' uses NULL to fill missing values in lists. As we now
use lists for most edge attributes, we need to handle this case
separately to ensure missing values are filled with NAs instead.

To fix this issue, we need to instantiate missing columns in dataframes
with NAs before calling 'plyr::rbind.fill'. This operation is constant
with respect to the amount of rows and should not impact performance too
much.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
@maxloeffler maxloeffler changed the title Miscellaneous fixes and adjustments Change default representation of edge attributes from vectors to lists & other miscellaneous adjustments Dec 16, 2024
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a few comments regarding the news. Please find them below.

NEWS.md Outdated

### Fixed

- Fix the creation of edgelists for issue-based artifact-networks by correctly iterating over the issue data (PR #264, 321d85043112971c04998249c14a0677a32c9004)
- Fix a bug in `extract.timestamps` that occurs when the first `data.source` contains empty data and that leads to a return value of type numeric which should be POSIXct (PR #270, 10696e4cf4ae92371917ed8ccaec2b0183da145c, 646c01a42ad8decfbc9040030e790e51cb65cffd)
- Fix `read.commit.interactions` by explicitly considering empty commit interaction data (PR #274, f591528a0f1f11b1a4390949ab770f3f74a766f9)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty commit interaction data ➡️ non-existent commit interactions

(for explanation: the commit-interaction data per se are present, but specific interactions between specific commits are empty, that is, non-existent)

@bockthom bockthom added this to the v5.0 milestone Dec 17, 2024
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
The igraph maintainers decided that they will not re-install consistency
with their older versions (in terms of filling in non-existent values in
'igraph::disjoint_union' and 'igraph::add_edges' with NAs) yet
(igraph/rigraph#1587). The previously
temporary fix is therefore now semi-permanent.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
@maxloeffler
Copy link
Contributor Author

Between the last and the second to last push I forgot to update the copyright headers of util-networks.R 😅.

Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found a few minor inconsistencies in the README, please find them below.
If also @hechtlC is fine with the new announcement, this PR is ready to be merged then.

bockthom
bockthom previously approved these changes Jan 13, 2025
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
@bockthom bockthom merged commit 58f66cf into se-sic:dev Jan 14, 2025
9 checks passed
@bockthom bockthom mentioned this pull request Feb 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants