Skip to content

[WIP][Cling] Refactor CIFactory and add incremental action support#21903

Draft
SahilPatidar wants to merge 2 commits into
root-project:masterfrom
SahilPatidar:incr_action
Draft

[WIP][Cling] Refactor CIFactory and add incremental action support#21903
SahilPatidar wants to merge 2 commits into
root-project:masterfrom
SahilPatidar:incr_action

Conversation

@SahilPatidar
Copy link
Copy Markdown

This Pull request:

This PR refactors CIFactory and adds support for IncrementalAction (from Clang-Repl). The goal is to reduce the amount of Clang-specific logic handled inside CIFactory and rely more on Clang’s existing infrastructure.

Changes or fixes:

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #

@devajithvs devajithvs self-assigned this Apr 13, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 13, 2026

Test Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit 3d5a177.

♻️ This comment has been updated with latest results.

@SahilPatidar
Copy link
Copy Markdown
Author

@vgvassilev
Most of the failing tests are related to Clad:

TEST FAILURES:
1:cppinterop-CppInterOpTests
185:gtest-hist-hist-TFormulaGradientTests
186:gtest-hist-hist-TFormulaHessianTests
271:gtest-math-mathcore-CladDerivatorTests
309:minuit2_testADMinim
330:gtest-roofit-histfactory-testHistFactory
350:gtest-roofit-roofitcore-testRooFuncWrapper
353:gtest-roofit-roofitcore-testRooMinimizer
361:test-stressroofit-codegen
426:tmva-sofie-test-TestCladAutodiff
430:gtest-tmva-sofie-TestGemmDerivative
922:tutorial-math-fit-exampleFit3D
945:tutorial-math-fit-minuit2GausFit

Some of them show errors like:

error: static assertion failed: clad doesn't appear to be loaded; make sure that you pass clad.so to clang.
        static_assert(false, "clad doesn't appear to be loaded; make sure that "
                      ^~~~~

This may be happening because I made rough changes in FrontendAction::CreateWrappedASTConsumer, where it now directly returns the DeclCollector from IncrementalAction::CreateASTConsumer without adding the plugin consumer.

Previously, I mentioned an issue related to the Clad plugin consumer. There was a problem with CladPlugin::initializeSema adding a SemaExternalSource, which I have resolved.

If we allow the plugin consumer, then during BeginSourceFile, when setASTConsumer is called, it triggers Initialize. This ends up stealing and replacing the CompilerInstance’s consumers. I’m not fully sure why this happens, but it appears to be causing many of the test failures.

In the original Cling implementation, as far as I can see, we never call Plugin::Initialize (so no replacement happens). Instead, we take the plugin consumer and pass it to the DeclCollector consumer.

We need to find a way (to pass plugin into DeclCollector or something else) to solve this first before moving forward.

@vgvassilev
Copy link
Copy Markdown
Member

When clad is OFF we should not run these tests. This means that we need to suppress them in cmake. cc: @guitargeek

Are the related llvm/clang changes really required? I think we should do extra effort to avoid them..

@SahilPatidar
Copy link
Copy Markdown
Author

Those are in CodeGenAction.cpp. On my local system, they were causing a segmentation fault. I’m not sure why, but after commenting them out, the crash stopped:

// PrettyStackTraceDecl CrashInfo(*D.begin(), SourceLocation(),
//                                Context->getSourceManager(),
//                                "LLVM IR generation of declaration");

And another change related to CompilerInstance and SourceManager initialization. As I mentioned, in Cling we use one large virtual source file to keep source locations consistent.

For now, I added a temporary helper:

runIncrementalSourceMgrInitializer

@vgvassilev
Copy link
Copy Markdown
Member

Those are in CodeGenAction.cpp. On my local system, they were causing a segmentation fault. I’m not sure why, but after commenting them out, the crash stopped:

// PrettyStackTraceDecl CrashInfo(*D.begin(), SourceLocation(),
//                                Context->getSourceManager(),
//                                "LLVM IR generation of declaration");

And another change related to CompilerInstance and SourceManager initialization. As I mentioned, in Cling we use one large virtual source file to keep source locations consistent.

For now, I added a temporary helper:

runIncrementalSourceMgrInitializer

Sounds suspicious. Maybe we should run valgrind.

@SahilPatidar
Copy link
Copy Markdown
Author

We need to address these issues in the current setup:

  • Based on the original Cling setup, CladPlugin should be passed to DeclCollector, not to CompilerInstance.
  • initialization of the SourceManager to avoid clang side work.

For SourceManager related:
Currently, we use one large virtual file as the main file to generate unique source locations. We create this using custom logic during SourceManager initialization. However, clang::FrontendAction::BeginSourceFile does not provide a way to do this without modifying its behavior.

So, I tried overriding the MainFileID inside clang::FrontendAction::BeginSourceFileAction. I moved the source initialization code there. It builds fine, and I did not notice any other test failures compare to previous version.

I also tried another approach (just to test how it behaves). I let BeginSourceFile initialize the SourceManager as usual, and then I created a new virtual file. This file uses the end location of the main file when creating its FileID, so it acts like an includer instead of the main file.

const char* Filename = "<<< includer >>>";
FileEntryRef FE = FM.getVirtualFileRef(Filename, 1U << 15U, time(0));

SM.setFileIsTransient(FE);

SourceLocation Result = SM.getLocForEndOfFile(SM.getMainFileID());

m_VirtualFileID =
    SM.createFileID(FE, Result, SrcMgr::C_User);

auto Buffer = llvm::MemoryBuffer::getMemBufferCopy("/*CLING DEFAULT MEMBUF*/;\n");

SM.overrideFileContents(FE, std::move(Buffer));

After this, I did not see any issue while building. But when I ran roottest-root-tree, a few tests started failing. One of them failed with this error:

3484: /Users/sahilpatidar/Desktop/root/roottest/root/treeformula/stl/runpair.C:4:11: error: redefinition of 'file' as different kind of symbol
3484:    TFile *file = TFile::Open("RefTest.root");
3484:           ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/sys/kernel_types.h:54:8: note: previous definition is here
3484: struct file;

This error happens inside CheckVariableDeclaration:

bool DeclExtractor::CheckForClashingNames(
    const llvm::SmallVector<NamedDecl*, 4>& Decls,
    DeclContext* DC) {

  for (size_t i = 0; i < Decls.size(); ++i) {
    ...
    else if (VarDecl* VD = dyn_cast<VarDecl>(ND)) {
      LookupResult Previous(*m_Sema, ND->getDeclName(), ND->getLocation(),
                            Sema::LookupOrdinaryName,
                            RedeclarationKind::ForVisibleRedeclaration);
      m_Sema->LookupQualifiedName(Previous, DC);
      m_Sema->CheckVariableDeclaration(VD, Previous);
      ...

I am not sure why this is happening, and I am also not sure any of these way is correct? Can we do something better?

@vgvassilev
Copy link
Copy Markdown
Member

Can you debug side by side with the working version?

@SahilPatidar
Copy link
Copy Markdown
Author

I found the issue.

The second approach (using a different virtual file instead of the main file) was failing because of a check in DefinitionShadower::Transform. This code decides whether to put declarations inside the __cling_N5 namespace to avoid name conflicts.

It uses typedInClingPrompt for this check. If it returns true, the declaration is wrapped inside the namespace.

In my case, since I used another virtual file instead of the main file, this function returned false. Because of that, the declaration was not added to the namespace, which caused the name collision error.

static bool typedInClingPrompt(FullSourceLoc L) {
  if (L.isInvalid())
    return false;

  const SourceManager &SM = L.getManager();
  const FileID FID = SM.getFileID(L);

  return SM.isFileOverridden(SM.getFileEntryForID(FID)) &&
         (SM.getFileID(SM.getIncludeLoc(FID)) == SM.getMainFileID());
}

@devajithvs
Copy link
Copy Markdown
Contributor

In my case, since I used another virtual file instead of the main file, this function returned false. Because of that, the declaration was not added to the namespace, which caused the name collision error.

Maybe we can add a check to see if the buffer name starts with input_line

Something similar to what we have here:

assert(FE->getName().contains("input_line_"));

@SahilPatidar
Copy link
Copy Markdown
Author

I tried it, and now the tests match the working version on my machine. If second approach is correct to do in Cling, then the only remaining issue is the Clad plugin.

@vgvassilev
Copy link
Copy Markdown
Member

For Clad, here is where we steal the consumers and tune them: https://github.com/vgvassilev/clad/blob/db7f52fafcb45bf8cbaa8b0c7256c664c553e940/tools/ClangPlugin.cpp#L154 and here https://github.com/vgvassilev/clad/blob/db7f52fafcb45bf8cbaa8b0c7256c664c553e940/tools/ClangPlugin.h#L173

I suspect some of these assumptions are incorrect after your patch.

@SahilPatidar
Copy link
Copy Markdown
Author

From the code, I see that in the original setup we never called Initialize for the Clad consumer. I also checked in the debugger and confirmed we never did any “stealing” of the consumer.

In the original version, we just passed the plugin to the DeclCollector consumer here:
https://github.com/vgvassilev/root/blob/7c320ae5313332784afd0727e95f1e9cb99ac44b/interpreter/cling/lib/Interpreter/IncrementalParser.cpp#L339

In the new version, we pass the plugin consumer in CI during FrontendAction::BeginSourceFile, and setASTConsumer calls Initialize, which does the "stealing.” This is the behavior difference I see.

That’s why I directly return the consumer from FrontendAction::CreateWrappedConsumer without adding the Clad consumer for now, which causes Clad-related test failures. If I allow the Clad consumer to be passed in CI, other many tests fail.

@vgvassilev
Copy link
Copy Markdown
Member

vgvassilev commented Apr 25, 2026

From the code, I see that in the original setup we never called Initialize for the Clad consumer. I also checked in the debugger and confirmed we never did any “stealing” of the consumer.

In the original version, we just passed the plugin to the DeclCollector consumer here: https://github.com/vgvassilev/root/blob/7c320ae5313332784afd0727e95f1e9cb99ac44b/interpreter/cling/lib/Interpreter/IncrementalParser.cpp#L339

In the new version, we pass the plugin consumer in CI during FrontendAction::BeginSourceFile, and setASTConsumer calls Initialize, which does the "stealing.” This is the behavior difference I see.

That’s why I directly return the consumer from FrontendAction::CreateWrappedConsumer without adding the Clad consumer for now, which causes Clad-related test failures. If I allow the Clad consumer to be passed in CI, other many tests fail.

Ok, we will need to think how to circumvent the stealing on clad’s side. Can you think of a way?

EDIT: Maybe we steal only when the preprocessor has no CLING macro defined? Or even if we don’t have incremental extensions defined.

@SahilPatidar
Copy link
Copy Markdown
Author

I need to understand the behavior when we make changes on the Clad side. How can I modify the Clad plugin and recompile it on root side?

@vgvassilev
Copy link
Copy Markdown
Member

I need to understand the behavior when we make changes on the Clad side. How can I modify the Clad plugin and recompile it on root side?

Not easy. One is edit the source in the build folder clad-prefix, then go to the build directory clad-prefix/…/build and do “make install”. Then cd to the top-most root_build and call “make Cling”. That should be all you need.

@SahilPatidar
Copy link
Copy Markdown
Author

I made a few changes to check if this works.

First, I commented out Clad::ClangPlugin::Initialize to stop it from taking the ASTConsumer from CI.

Second, I changed this and updated Clad::ClangPlugin::Initialize (even though it is commented out). This function expects clad;plugin at the end of the multiplex, but I moved it to the beginning:

PluginASTAction::ActionType getActionType() override {
  // return AddAfterMainAction;
  return AddBeforeMainAction;
}

In the original setup, we add clad;plugin before the codegen consumer, so I made this change to keep that order.

Third, I updated ParseOrWrapTopLevelDecl to call cladplugin, since it is in CI and not in DeclCollector:

// if (ADecl && !m_Consumer->HandleTopLevelDecl(ADecl.get())) {
if (ADecl && !S.getASTConsumer().HandleTopLevelDecl(ADecl.get())) {

After these changes, the tests that were failing on Ubuntu build are now passing.

I want to understand how to avoid initialization (so it does not take the consumer), and when we should do that.

I don’t see any other test failures locally after these changes:

roottest-root-tree
roottest-root-hist
roottest-root-math
io-stdarray
roottest-cling

@SahilPatidar
Copy link
Copy Markdown
Author

Clad-side changes - SahilPatidar/clad@f6ad58d.

@vgvassilev
Copy link
Copy Markdown
Member

Clad-side changes - SahilPatidar/clad@f6ad58d.

Does the clad test suite pass?

@SahilPatidar
Copy link
Copy Markdown
Author

I didn’t check on the Clad side. I’ll open a draft PR to see whether it passes Clad-CI.

@SahilPatidar
Copy link
Copy Markdown
Author

Currently, we are failing with this kind of error:

In file included from <<< cling interactive line includer >>>:2:
/Users/sftnight/ROOT-CI/build/include/ROOT/RFloat16.hxx:11:10: fatal error: module 'std' is needed but has not been provided, and implicit use of module files is disabled
#include <cstdint>
         ^
Error: Parsing #pragma failed /Users/sftnight/ROOT-CI/src/core/base/inc/LinkDef.h

The build works locally, so it looks like some flag is changing the build behavior in CI, and our changes are triggering it.

While building locally, I noticed this flag might be causing the different behavior after our changes:
-DCMAKE_CXX_STANDARD=23
Any idea what might be causing this?

@devajithvs devajithvs added the clean build Ask CI to do non-incremental build on PR label May 22, 2026
@devajithvs
Copy link
Copy Markdown
Contributor

Currently, we are failing with this kind of error:

In file included from <<< cling interactive line includer >>>:2:
/Users/sftnight/ROOT-CI/build/include/ROOT/RFloat16.hxx:11:10: fatal error: module 'std' is needed but has not been provided, and implicit use of module files is disabled
#include <cstdint>
         ^
Error: Parsing #pragma failed /Users/sftnight/ROOT-CI/src/core/base/inc/LinkDef.h

The build works locally, so it looks like some flag is changing the build behavior in CI, and our changes are triggering it.

That's because the CI was compiling incrementally. I've added a flag to clean build. Can you rebase and push again to trigger a rebuild?

@SahilPatidar
Copy link
Copy Markdown
Author

cmake -Droottest=On \
      -G Ninja \
      -Dtesting=On \
      -DCMAKE_BUILD_TYPE=Debug \
      -DLLVM_ENABLE_ASSERTIONS=On \
      ../

I mean the build succeeds locally with the command above, but I get the same error if I add these flags:

cmake -Droottest=On \
      -G Ninja \
      -Dtesting=On \
      -DCMAKE_BUILD_TYPE=Release \
      -DLLVM_ENABLE_ASSERTIONS=On \
      -Dlibcxx=OFF -Dmathmore=ON -DCMAKE_CXX_STANDARD=23 ../

Currently, I am trying to build with the following configuration. If this works, then it seems that the CMAKE_CXX_STANDARD=23 flag is causing some issue with the new changes:

cmake -S ../ -B . -G Ninja  -DCMAKE_BUILD_TYPE=RelWithDebInfo "-DCMAKE_CXX_STANDARD=17" "-DLLVM_ENABLE_ASSERTIONS=ON" "-Dall=OFF" "-Darrow=OFF" "-Dasan=OFF" "-Dasimage=ON" "-Dasserts=ON" "-Dbuiltin_cfitsio=ON" "-Dbuiltin_clang=ON" "-Dbuiltin_cling=ON" "-Dbuiltin_fftw3=ON" "-Dbuiltin_freetype=ON" "-Dbuiltin_ftgl=ON" "-Dbuiltin_gl2ps=ON" "-Dbuiltin_gsl=ON" "-Dbuiltin_gtest=ON" "-Dbuiltin_llvm=ON" "-Dbuiltin_lz4=ON" "-Dbuiltin_lzma=ON" "-Dbuiltin_nlohmannjson=ON" "-Dbuiltin_openssl=ON" "-Dbuiltin_openui5=ON" "-Dbuiltin_pcre=ON" "-Dbuiltin_tbb=ON" "-Dbuiltin_tiff=ON" "-Dbuiltin_unuran=ON" "-Dbuiltin_vdt=ON" "-Dbuiltin_xrootd=ON" "-Dbuiltin_xxhash=ON" "-Dbuiltin_zlib=OFF" "-Dbuiltin_zstd=ON" "-Dccache=ON" "-Dcefweb=OFF" "-Dcheck_connection=ON" "-Dclad=ON" "-Dclingtest=OFF" "-Dcocoa=ON" "-Dcoverage=OFF" "-Dcuda=OFF" "-Dcurl=ON" "-Ddaos=OFF" "-Ddataframe=ON" "-Ddavix=OFF" "-Ddcache=OFF" "-Ddev=OFF" "-Ddistcc=OFF" "-Dfail-on-missing=OFF" "-Dfcgi=OFF" "-Dfftw3=ON" "-Dfitsio=ON" "-Dgdml=ON" "-Dgminimal=OFF" "-Dgnuinstall=OFF" "-Dgsl_shared=OFF" "-Dgviz=OFF" "-Dhttp=ON" "-Dimt=ON" "-Dlibcxx=OFF" "-Dmacos_native=OFF" "-Dmathmore=ON" "-Dmemory_termination=OFF" "-Dminimal=OFF" "-Dminuit2_omp=OFF" "-Dmpi=OFF" "-Dopengl=ON" "-Dpyroot=ON" "-Dpythia8=OFF" "-Dqt6web=OFF" "-Dr=OFF" "-Droofit=ON" "-Droofit_multiprocess=OFF" "-Droot7=ON" "-Drootbench=OFF" "-Droottest=ON" "-Druntime_cxxmodules=ON" "-Dshadowpw=OFF" "-Dshared=ON" "-Dsoversion=OFF" "-Dspectrum=ON" "-Dsqlite=ON" "-Dssl=ON" "-Dtest_distrdf_dask=OFF" "-Dtest_distrdf_pyspark=OFF" "-Dtesting=ON" "-Dtmva-cpu=ON" "-Dtmva-cudnn=OFF" "-Dtmva-gpu=OFF" "-Dtmva-pymva=OFF" "-Dtmva-rmva=OFF" "-Dtmva-sofie=OFF" "-Dtmva=ON" "-Dunfold=ON" "-Dunuran=ON" "-During=OFF" "-Dvdt=ON" "-Dvecgeom=OFF" "-Dwebgui=ON" "-Dwin_broken_tests=OFF" "-Dwinrtdebug=OFF" "-Dx11=OFF" "-Dxml=ON" "-Dxrootd=ON" -DROOT_COMPILEDATA_IGNORE_BUILD_NODE_CHANGES=ON

@SahilPatidar
Copy link
Copy Markdown
Author

I fixed the issue. It now builds successfully locally with the same configuration. Next, I’ll run ctest.

After that, I want to know how to test the Clad-related changes on CI. Can I modify the CMake file in the PR to point to my Clad branch? I mean, just changing this in the PR:

list(APPEND _clad_extra_settings GIT_REPOSITORY https://github.com/vgvassilev/clad.git)
list(APPEND _clad_extra_settings GIT_TAG v2.2)

@vgvassilev
Copy link
Copy Markdown
Member

I fixed the issue. It now builds successfully locally with the same configuration. Next, I’ll run ctest.

After that, I want to know how to test the Clad-related changes on CI. Can I modify the CMake file in the PR to point to my Clad branch? I mean, just changing this in the PR:

list(APPEND _clad_extra_settings GIT_REPOSITORY https://github.com/vgvassilev/clad.git)
list(APPEND _clad_extra_settings GIT_TAG v2.2)

Yes, exactly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clean build Ask CI to do non-incremental build on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants