Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyzing JS code with annotations #32

Open
franktip opened this issue Jun 3, 2020 · 11 comments
Open

analyzing JS code with annotations #32

franktip opened this issue Jun 3, 2020 · 11 comments
Labels
CLI

Comments

@franktip
Copy link

@franktip franktip commented Jun 3, 2020

I've been trying to analyze some code from the lumo project. Some of the code contains annotations, see e.g.

https://github.com/anmonteiro/lumo/blob/master/src/js/util.js

which contains code like:

export function expandPath(somePath: string): string {
const tildeExpandedPath = somePath.startsWith('')
? somePath.replace(/^
/, os.homedir())
: somePath;
return path.resolve(tildeExpandedPath);
}

I have no problem running queries against this project, but when I try to create a test that analyze some code fragments from this project, the extractor fails with a fatal error:

Could not extract a dataset in /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo: Extraction command /Users/franktip/codeql-home/codeql/tools/osx64/java/bin/java failed with status 1
Extraction command /Users/franktip/codeql-home/codeql/tools/osx64/java/bin/java failed with status 1
[1/1] FAILED(EXTRACTION) /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo/reachable.qlref
0 tests passed; 1 tests failed:
FAILED: /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo/reachable.qlref

I have a few questions:

  • is there any way to inform the codeql test command that we're analyzing code with annotations?
  • can a better error message be produced?
  • strangely, the error goes away if I change the file type to ".ts" instead of ".js" (note though that the original project uses the .js extension)
@hmakholm
Copy link
Contributor

@hmakholm hmakholm commented Jun 3, 2020

@asgerf
Copy link

@asgerf asgerf commented Jun 3, 2020

is there any way to inform the codeql test command that we're analyzing code with annotations?

Yes, add a file named options containing:

semmle-extractor-options: --experimental

There's an unfortunate discrepancy between how codeql test and codeql database create builds databases for JavaScript -- the former uses a legacy extractor interface which we're in the process of phasing out.

can a better error message be produced?

@hmakholm shouldn't codeql test forward the error from the extractor? I'm getting this error with qltest:

[2020-06-03 21:57:13] [ERROR] Spawned process exited abnormally (code 1; tried to run: [HOME/target/intree/standard/tools/macos/jdk-extractor-java/bin/java, -jar, HOME/target/intree/standard/tools/extractor-javascript.jar, --quiet, --abort-on-parse-errors, --typescript, regression-test1.js])
[1/1] Extraction failed in HOME/ql/javascript/ql/test/library-tests/Regression (extraction: 614ms)
Unexpected token: 8:35
Invocation failed (exit code 1): HOME/target/intree/standard/tools/macos/jdk-extractor-java/bin/java -jar HOME/target/intree/standard/tools/extractor-javascript.jar --quiet --abort-on-parse-errors --typescript regression-test1.js

strangely, the error goes away if I change the file type to ".ts" instead of ".js" (note though that the original project uses the .js extension)

They use Flow syntax, which is different from TypeScript and doesn't have its own extension.

@hmakholm
Copy link
Contributor

@hmakholm hmakholm commented Jun 3, 2020

shouldn't codeql test forward the error from the extractor?

It should, but unfortunately that's not implemented yet. The internal issue is github/codeql-coreql-team#333 -- which might receive greater priority now that there's an external complaint about it :-)

@franktip
Copy link
Author

@franktip franktip commented Jun 4, 2020

Thanks -- creating the options file works for me.

I have a follow-up question. First, some context: I created a test that contains some flow annotations. Then, in the directory containing the test, I ran "npm i flow-remove-types -SD" to install the annotation-remover --- this has the effect of installing many packages in a local node_modules subdirectory. Once that's installed, I can run "npm run flow:build" to strip away the annotations from the code, and the resulting stripped project is placed in a newly created subdirectory "lib". I can then run the JS code that is in this lib directory.

Now, when I run my query, I have two problems:

  • the analysis becomes extremely slow, presumably because the extractor is finding all the code that was installed in the node_modules directory. However, this code is not relevant to my own project -- it's only there for the flow tools.
  • if I remove the node_modules directory before running my analysis, my query runs quickly, but now I'm getting two sets of results: one for the annotated code in the "src" directory, and one for the stripped code in the lib directory.

So my question is: is there any way to inform "codeql test run" that it should only extract the code in the "src" directory?

@asgerf
Copy link

@asgerf asgerf commented Jun 4, 2020

Try appending --exclude lib --exclude node_modules to the line in the options file:

semmle-extractor-options: --experimental --exclude lib --exclude node_modules
@franktip
Copy link
Author

@franktip franktip commented Jun 4, 2020

Thanks! Are these options documented anywhere? (I was looking, but perhaps not in the right place)

@franktip
Copy link
Author

@franktip franktip commented Jun 4, 2020

Hmm, the --exclude lib --exclude node_modules options doesn't seem to make any difference. The query is still extremely slow..

@asgerf
Copy link

@asgerf asgerf commented Jun 5, 2020

Hm, try ./node_modules/** and ./lib/** instead. Sorry, as I mention this extractor interface is being phased out and I don't think we have proper documentation for it. The best reference is this file.

@franktip
Copy link
Author

@franktip franktip commented Jun 5, 2020

Hi Asger, unfortunately this does not seem to work for me either. Any other suggestions?

@max-schaefer
Copy link

@max-schaefer max-schaefer commented Jun 11, 2020

So, looking into this a bit, the reason that @asgerf's suggestion does not work is that (for historical reasons) codeql test extracts each .js file on its own (which is also one of the reasons it is slow), and when a file is explicitly passed to the extractor, --exclude flags are ignored.

There is a somewhat silly workaround: create a trivial tsconfig.json file containing only {}. I'm not going to explain why that works, but if you do that and then create an options file looking like this

semmle-extractor-options: --experimental --exclude lib/* --exclude node_modules/*

then lib and node_modules should be excluded.

@franktip
Copy link
Author

@franktip franktip commented Jun 11, 2020

Thanks, Max! I confirm that this works for me.

-Frank

@adityasharad adityasharad added the CLI label Jul 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.