Preface
For a lot of JavaScript developers that are moving over from traditional “browser” development to node.js, they might be casually aware of the require keyword that node.js provides you. If you’re wondering what I’m talking about, allow me to explain. In node.js, you can use the node package manager (NPM) to install packages that your application can use. Express, for example, is a great framework for serving up web applications, and you can grab it via NPM.
To load Express into your application, you tell node.js that you “require” it be loading into memory:
- var express = require(“express”);
Where's this going
Let’s say you want to abstract out a piece of your application into another file:
- userRepository.js
- var userRepository = function (){
- var self = this;
- self.addUser = function(name, age){
- self.user = [name, age]
- };
- self.get = function (){
- return self.user
- }
- };
- //exports.userRepository = userRepository
- module.exports = userRepository
- main.js
- var userRepository = require("./userRepository.js");
- ur = new userRepository()
- ur.addUser('John', 37)
- console.log(ur.get())
More Details
Node uses two core modules for managing module dependencies:
You can think of the require module as the command and the module module as the organizer of all required modules. Requiring a module in Node isn’t that complicated of a concept:
- const config = require('/path/to/file');
I’ll attempt to explain with examples these different stages and how they affect the way we write modules in Node. Let me first create a directory to host all the examples using my terminal:
All the commands in the rest of this article will be run from within ~/learn-node.
Resolving a local path
Let me introduce you to the module object. You can check it out in a simple REPL session:
Every module object gets an id property to identify it. This id is usually the full path to the file, but in a REPL session it’s simply
When we require a 'find-me' module, without specifying a path:
- require('find-me');
The paths list is basically a list of node_modules directories under every directory from the current directory to the root directory. It also includes a few legacy directories whose use is not recommended. If Node can’t find find-me.js in any of these paths, it will throw a “cannot find module error.”
If you now create a local node_modules directory and put a find-me.js in there, the require('find-me') line will find it.
If another find-me.js file existed in any of the other paths, for example, if we have a node_modules directory under the home directory and we have a different find-me.js file in there:
When we require('find-me') from within the learn-node directory — which has its own node_modules/find-me.js, the find-me.js file under the home directory will not be loaded at all. If we remove the local node_modules directory under ~/learn-node and try to require find-me one more time, the file under the home’s node_modules directory would be used:
Requiring a folder
Modules don’t have to be files. We can also create a find-me folder under node_modules and place an index.js file in there. The same require('find-me') line will use that folder’s index.js file:
Note how it ignored the home directory’s node_modules path again since we have a local one now. An index.js file will be used by default when we require a folder, but we can control what file name to start with under the folder using the main property in package.json. For example, to make the require('find-me') line resolve to a different file under the find-me folder, all we need to do is add a package.json file in there and specify which file should be used to resolve this folder:
require.resolve
If you want to only resolve the module and not execute it, you can use the require.resolve function. This behaves exactly the same as the main require function, but does not load the file. It will still throw an error if the file does not exist and it will return the full path to the file when found.
This can be used, for example, to check whether an optional package is installed or not and only use it when it’s available.
Relative and absolute paths
Besides resolving modules from within the node_modules directories, we can also place the module anywhere we want and require it with either relative paths (./ and ../) or with absolute paths starting with /. If, for example, the find-me.jsfile was under a lib folder instead of the node_modules folder, we can require it with:
- require('./lib/find-me');
Create a lib/util.js file and add a console.log line there to identify it. Also, console.log the module object itself:
Do the same for an index.js file, which is what we’ll be executing with the node command. Make this index.js file require lib/util.js:
Now execute the index.js file with node:
Note how the main index module (id: '.') is now listed as the parent for the lib/util module. However, the lib/util module was not listed as a child of the index module. Instead, we have the [Circular] value there because this is a circular reference. If Node prints the lib/util module object, it will go into an infinite loop. That’s why it simply replaces the lib/util reference with [Circular].
More importantly now, what happens if the lib/util module required the main [i]index module? This is where we get into what’s known as the circular modular dependency, which is allowed in Node. To understand it better, let’s first understand a few other concepts on the module object.
exports, module.exports, and synchronous loading of modules
In any module, exports is a special object. If you’ve noticed above, every time we’ve printed a module object, it had an exports property which has been an empty object so far. We can add any attribute to this special exports object. For example, let’s export an id attribute for index.js and lib/util.js:
- // Add the following line at the top of lib/util.js
- exports.id = 'lib/util';
- // Add the following line at the top of index.js
- exports.id = 'index';
Note how the exports object now has the attributes we defined in each module. You can put as many attributes as you want on that exports object, and you can actually change the whole object to be something else. For example, to change the exports object to be a function instead of an object, we do the following:
- // Add the following line in index.js before the console.log
- module.exports = function() {};
We can’t actually do that because the exports variable inside each module is just a reference to module.exports which manages the exported properties. When we reassign the exports variable, that reference is lost and we would be introducing a new variable instead of changing the module.exports object.
The module.exports object in every module is what the require function returns when we require that module. For example, change the require('./lib/util') line in index.js into:
- const UTIL = require('./lib/util');
- console.log('UTIL:', UTIL);
Let’s also talk about the loaded attribute on every module. So far, every time we printed a module object, we saw a loaded attribute on that object with a value of false.
The module module uses the loaded attribute to track which modules have been loaded (true value) and which modules are still being loaded (false value). We can, for example, see the index.js module fully loaded if we print its module object on the next cycle of the event loop using a setImmediate call:
- // In index.js
- setImmediate(() => {
- console.log('The index.js module object is now loaded!', module)
- });
- The index.js module object is now loaded! Module {
- id: '.',
- exports: { id: 'index' },
- parent: null,
- filename: '/root/learn-node/index.js',
- loaded: true,
- children:
- [ Module {
- id: '/root/learn-node/lib/util.js',
- exports: [Object],
- parent: [Circular],
- filename: '/root/learn-node/lib/util.js',
- loaded: true,
- children: [],
- paths: [Object] } ],
- paths:
- [ '/root/learn-node/node_modules',
- '/root/node_modules',
- '/node_modules' ] }
- fs.readFile('/etc/passwd', (err, data) => {
- if (err) throw err;
- exports.data = data; // Will not work.
- });
Let’s now try to answer the important question about circular dependency in Node: What happens when module 1 requires module 2, and module 2 requires module 1? To find out, let’s create the following two files under lib/, module1.js and module2.js and have them require each other:
- // lib/module1.js
- exports.a = 1;
- require('./module2');
- exports.b = 2;
- exports.c = 3;
- // lib/module2.js
- const Module1 = require('./module1');
- console.log('Module1 is partially loaded here', Module1);
We required module2 before module1 was fully loaded, and since module2 required module1 while it wasn’t fully loaded, what we get from the exports object at that point are all the properties exported prior to the circular dependency. Only the a property was reported because both b and c were exported after module2 required and printed module1. Node keeps this really simple. During the loading of a module, it builds the exports object. You can require the module before it’s done loading and you’ll just get a partial exports object with whatever was defined so far.
JSON and C/C++ addons
We can natively require JSON files and C++ addon files with the require function. You don’t even need to specify a file extension to do so. If a file extension was not specified, the first thing Node will try to resolve is a .js file. If it can’t find a .js file, it will try a .json file and it will parse the .json file if found as a JSON text file. After that, it will try to find a binary .node file. However, to remove ambiguity, you should probably specify a file extension when requiring anything other than .js files.
Requiring JSON files is useful if, for example, everything you need to manage in that file is some static configuration values, or some values that you periodically read from an external source. For example, if we had the following config.json file:
- {
- "host": "localhost",
- "port": 8080
- }
We can require it directly like this:
If Node can’t find a .js or a .json file, it will look for a .node file and it would interpret the file as a compiled addon module. The Node documentation site has a sample addon file which is written in C++. It’s a simple module that exposes a hello() function and the hello function outputs “world.” You can use the node-gyp package to compile and build the .cc file into a .addon file. You just need to configure a binding.gyp file to tell node-gyp what to do.
Once you have the addon.node file (or whatever name you specify in binding.gyp) then you can natively require it just like any other module:
- const addon = require('./addon');
- console.log(addon.hello());
Looking at the functions for each extension, you can clearly see what Node will do with each. It uses module._compile for .js files, JSON.parse for .json files, and process.dlopen for .node files.
All code you write in Node will be wrapped in functions
Node’s wrapping of modules is often misunderstood. To understand it, let me remind you about the exports/module.exports relation. We can use the exports object to export properties, but we cannot replace the exports object directly because it’s just a reference to module.exports
- exports.id = 42; // This is ok.
- exports = { id: 42 }; // This will not work.
- module.exports = { id: 42 }; // This is ok.
- var answer = 42;
The answer is simple. Before compiling a module, Node wraps the module code in a function, which we can inspect using the wrapper property of the module module:
Node does not execute any code you write in a file directly. It executes this wrapper function which will have your code in its body. This is what keeps the top-level variables that are defined in any module scoped to that module. This wrapper function has 5 arguments: exports, require, module, __filename, and __dirname. This is what makes them appear to look global when in fact they are specific to each module.
All of these arguments get their values when Node executes the wrapper function. exports is defined as a reference to module.exports prior to that. require and module are both specific to the function to be executed, and __filename/__dirname variables will contain the wrapped module’s absolute filename and directory path. You can see this wrapping in action if you run a script with a problem on its first line:
Note how the first line of the script as reported above was the wrapper function, not the bad reference. Moreover, since every module gets wrapped in a function, we can actually access that function’s arguments with the argumentskeyword:
# echo "console.log(arguments)" > index.js
# node index.js
- { '0': {},
- '1':
- { [Function: require]
- resolve: [Function: resolve],
- main:
- Module {
- id: '.',
- exports: {},
- parent: null,
- filename: '/root/learn-node/index.js',
- loaded: false,
- children: [],
- paths: [Object] },
- extensions: { '.js': [Function], '.json': [Function], '.node': [Function] },
- cache: { '/root/learn-node/index.js': [Object] } },
- '2':
- Module {
- id: '.',
- exports: {},
- parent: null,
- filename: '/root/learn-node/index.js',
- loaded: false,
- children: [],
- paths:
- [ '/root/learn-node/node_modules',
- '/root/node_modules',
- '/node_modules' ] },
- '3': '/root/learn-node/index.js',
- '4': '/root/learn-node' }
The wrapping function’s return value is module.exports. Inside the wrapped function, we can use the exports object to change the properties of module.exports, but we can’t reassign exports itself because it’s just a reference. What happens is roughly equivalent to:
- function (require, module, __filename, __dirname) {
- let exports = module.exports;
- // Your Code...
- return module.exports;
- }
The require object
There is nothing special about require. It’s an object that acts mainly as a function that takes a module name or path and returns the module.exports object. We can simply override the require object with our own logic if we want to. For example, maybe for testing purposes, we want every require call to be mocked by default and just return a fake object instead of the required module exports object. This simple reassignment of require will do the trick:
- require = function() {
- return { mocked: true };
- }
The require object also has properties of its own. We’ve seen the resolve property, which is a function that performs only the resolving step of the require process. We’ve also seen require.extensions above. There is also require.mainwhich can be helpful to determine if the script is being required or run directly.
Say, for example, that we have this simple printInFrame function in print-in-frame.js:
- print-in-frame.js
- const printInFrame = (size, header) => {
- console.log('*'.repeat(size));
- console.log(header);
- console.log('*'.repeat(size));
- };
1. From the command line directly like this:
Passing 8 and Hello as command line arguments to print “Hello” in a frame of 8 stars.
2. With require. Assuming the required module will export the printInFrame function and we can just call it:
- const print = require('./print-in-frame');
- print(5, 'Hey');
Those are two different usages. We need a way to determine if the file is being run as a stand-alone script or if it is being required by other scripts. This is where we can use this simple if statement:
- if (require.main === module) {
- // The file is being executed directly (not with require)
- }
- const printInFrame = (size, header) => {
- console.log('*'.repeat(size));
- console.log(header);
- console.log('*'.repeat(size));
- };
- if (require.main === module) {
- printInFrame(process.argv[2], process.argv[3]);
- } else {
- module.exports = printInFrame;
- }
All modules will be cached
Caching is important to understand. Let me use a simple example to demonstrate it. Say that you have the following ascii-art.js file that prints a cool looking header:
We want to display this header every time we require the file. So when we require the file twice, we want the header to show up twice:
- require('./ascii-art') // will show the header.
- require('./ascii-art') // will not show the header.
We can see this cache by printing require.cache after the first require. The cache registry is simply an object that has a property for every required module. Those properties values are the module objects used for each module. We can simply delete a property from that require.cache object to invalidate that cache. If we do that, Node will re-load the module to re-cache it.
However, this is not the most efficient solution for this case. The simple solution is to wrap the log line in ascii-art.js with a function and export that function. This way, when we require the ascii-art.js file, we get a function that we can execute to invoke the log line every time:
- require('./ascii-art')() // will show the header.
- require('./ascii-art')() // will also show the header.
* Requiring modules in Node.js: Everything you need to know
* Node.js document - Modules
* Node.js Beyond the Basics Paperback – March 1, 2018
* Node.js: Using require to load your own file
沒有留言:
張貼留言