Navigating through dependency hell (for dummies)

Hello there!

Recently we all got introduced to Log4j's security vulnerability, which poses a risk to thousands of applications that depend directly or indirectly on the library. It might come off as a simple version upgrade, but for large tech organizations running hundreds of applications with multiple components and dependencies, pinpointing and addressing these issues properly can be cumbersome unless you know your way around dependency management. So I thought of writing a short blog to share a few of my learnings so far, since in my experience working through dependency resolution issues usually feels quite overwhelming for junior Software Engineers.

First things first: a dependency/library is just some external piece of code that you use in your project, either for convenience or because you need it. That was simple, but you also need to be aware that dependencies can be direct or transitive. Also, I will be calling a dependency a dep from now on.

Direct: You use a dep directly in your project and declare it yourself, so you know its exact name and version.

Transitive: You use a dep in your project, and that dep internally depends on other deps, which also get pulled into your project transitively.

Most useful libraries have their own transitive deps, and that's great because they handle their internal dep management without you having to bother about it. Well, if that's so great, why should I care about it then?

Because it quickly becomes a black box once you have several such deps in your project and you don't have any control over them, yet. What if two different deps transitively depend on a common dep, but with different versions? Well, congrats! Now two versions of the same class are competing for a spot on your project's classpath, and only one of them will actually be loaded. Your code might even compile fine, but just wait until you run it and get greeted with runtime issues such as ClassNotFoundException, NoClassDefFoundError and NoSuchMethodError.

If you are working with Maven, it resolves such conflicts by picking the version nearest to your project in the dependency tree, read here: https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#transitive-dependencies. However, the nearest version is not necessarily the one you want. The first step is to look at your project's dependency tree with the command:

mvn dependency:tree

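The output gives you an overview of the entire dep structure. From here you can figure out which direct deps pull in a common dep transitively but with different versions. Let's take this example, where B (through C) and E both depend on D. Left alone, Maven would pick D 1.0 here, because it sits two levels below A (via E) while D 2.0 sits three levels below (via B and C).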

  A
  ├── B
  │   └── C
  │       └── D 2.0
  └── E
      └── D 1.0

At this point, you can exclude D from either B or E, depending on which of them can do without its own version of D. Let's say we want to exclude D 1.0 from E, as we don't use any of E's functionality that depends on D 1.0. You would do it with Maven's exclusions tag:

<project>
  ...
  <dependencies>
    <dependency>
      <groupId>SampleGroupE</groupId>
      <artifactId>E</artifactId>
      <version>latest</version> <!-- placeholder: put E's actual version here -->
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <groupId>SampleGroupD</groupId>
          <artifactId>D</artifactId>
        </exclusion>
      </exclusions> 
    </dependency>
  </dependencies>
</project>

This is good, but we can do even better in cases where your project relies directly on D. For example, say D has some utility class that you use directly in your code. In such a scenario, instead of relying on transitive deps, we can pin the intended version of D by declaring it as a direct dependency, which then takes precedence over the transitive versions. Our dep tree would then look like this:

  A
  ├── B
  │   └── C
  │       └── D 2.0
  ├── E
  │   └── D 1.0
  │
  └── D 2.0
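
As a minimal sketch of that tree, A's pom could declare D directly as shown here. The group and artifact names reuse the sample names from the exclusion example, and the version is simply the 2.0 from the tree, so treat all of them as placeholders rather than real coordinates.

```xml
<project>
  <!-- ... rest of A's pom ... -->
  <dependencies>
    <!-- Declared directly in A, so it is the nearest D in the tree and
         wins over the D versions that B (via C) and E bring in transitively -->
    <dependency>
      <groupId>SampleGroupD</groupId>
      <artifactId>D</artifactId>
      <version>2.0</version>
    </dependency>
    <!-- B and E stay declared as before -->
  </dependencies>
</project>
```

Running mvn dependency:tree again afterwards is a quick way to confirm which version of D actually got picked.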

A good example of dependency exclusion is Spring Boot, where you can exclude the default embedded web server, Tomcat, and go with something like Jetty instead. Read here: https://docs.spring.io/spring-boot/docs/current/reference/html/howto.html#howto.webserver
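
Here is roughly what that swap looks like in a pom, along the lines of the Spring Boot how-to linked above. This sketch assumes the versions are managed for you (for example by spring-boot-starter-parent), which is why no version tags appear. If your project manages versions differently, you would need to add them.

```xml
<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
      <!-- Drop the default embedded Tomcat that the web starter pulls in -->
      <exclusion>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-tomcat</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- Bring in Jetty as the embedded web server instead -->
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jetty</artifactId>
  </dependency>
</dependencies>
```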

Dependency trees can also help in avoiding false positives. For instance, in the case of the Log4j vulnerability, only the dep log4j-core is a threat. In the case of my team's applications at PayPal, we depended only on log4j-to-slf4j and log4j-api, which come in transitively via the Spring Boot starter libs, so no immediate action was required. A dep tree check helps you quickly understand whether you are at risk, since you can look for the vulnerable dep in it. Read here: https://spring.io/blog/2021/12/10/log4j2-vulnerability-and-spring-boot

A few personal tips :

  1. If I am facing runtime issues due to a particular class and am not sure where it is coming from, I place breakpoints at the class's invocation points and also within all versions of the class file on my classpath. When a breakpoint hits, I can locate exactly which dep is being used, and also compare the code across versions in case of a NoSuchMethodError.

  2. If you are using particular libraries only for unit or functional tests, mark the dep's scope as test. This makes sure you don't end up with unnecessary transitive deps on your app's runtime classpath, where they can cause issues. I have seen this so many times with TestNG and RESTEasy being used together.

  3. Standardize dependency management by enforcing rules. You can use Maven's Enforcer plugin for this: https://maven.apache.org/enforcer/enforcer-rules/index.html. There are plenty of built-in rules that do basic checks for duplicate declarations, multiple versions of the same dep, etc. You can always write custom rules for your needs as well.
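
To make tips 2 and 3 a bit more concrete, here is a rough pom sketch. The TestNG and plugin versions are assumptions picked only for illustration, and the execution id is made up, so swap in whatever your build standardizes on. The two rules shown are built-in Enforcer rules: banDuplicatePomDependencyVersions flags the same dep being declared more than once in a pom, and dependencyConvergence fails the build when the tree contains more than one version of the same dep.

```xml
<project>
  <!-- ... rest of the pom ... -->
  <dependencies>
    <!-- Tip 2: a test-only library, kept off the app's runtime classpath -->
    <dependency>
      <groupId>org.testng</groupId>
      <artifactId>testng</artifactId>
      <version>7.4.0</version> <!-- assumed version -->
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Tip 3: fail the build when the dependency rules are violated -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-enforcer-plugin</artifactId>
        <version>3.0.0</version> <!-- assumed version -->
        <executions>
          <execution>
            <id>enforce-dependency-rules</id> <!-- hypothetical id -->
            <goals>
              <goal>enforce</goal>
            </goals>
            <configuration>
              <rules>
                <banDuplicatePomDependencyVersions/>
                <dependencyConvergence/>
              </rules>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

Expect dependencyConvergence to be noisy the first time you turn it on in an existing project, since it reports every version mismatch in the tree.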

Well, that's it for now. I hope this comes in handy if you are struggling with dependency resolution.