The VM implementation depends heavily on the language semantics.
That is why almost every language has its own VM.
function add(a, b) {
return a + b
}
add r1 a1 a2
ret r1
C++ is statically typed, so the value in a register can have only a single C++ type
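To make the limitation concrete, here is a minimal sketch of an interpreter for the two instructions above. The `Op`, `Instr`, `Run` and `RunAdd` names are hypothetical and not part of the original VM. Every register is a plain `double`, so this VM could never hold a string or an object in a register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-instruction bytecode, mirroring "add r1 a1 a2; ret r1".
enum class Op { Add, Ret };

struct Instr {
    Op op;
    std::size_t dst;  // destination register (Add) or returned register (Ret)
    std::size_t lhs;  // first source register (Add only)
    std::size_t rhs;  // second source register (Add only)
};

// Every register is a plain double: this is the single-type limitation.
double Run(const std::vector<Instr>& code, std::vector<double> regs) {
    for (const Instr& in : code) {
        switch (in.op) {
        case Op::Add:
            regs[in.dst] = regs[in.lhs] + regs[in.rhs];
            break;
        case Op::Ret:
            return regs[in.dst];
        }
    }
    assert(false && "bytecode must end with Ret");
    return 0.0;
}

// Register numbering assumed here: 0 = r1, 1 = a1, 2 = a2.
double RunAdd(double a, double b) {
    const std::vector<Instr> code = {
        {Op::Add, 0, 1, 2},
        {Op::Ret, 0, 0, 0},
    };
    return Run(code, {0.0, a, b});
}
```

Calling `RunAdd(2.0, 3.0)` executes both instructions and returns the content of `r1`.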
Every problem can be solved by adding another level of indirection.
…, except too many indirections.
C++ way for type erasure
Base class with virtual methods
struct Value {
    virtual ValuePtr Add(ValuePtr rhs) const = 0;
    // ...
};
struct NumberValue : Value {
    double m_Number;
};
sizeof(NumberValue)
- 16 B on a 64-bit platform (8 B vtable pointer + 8 B double) - 100% overhead
cache miss on every access to the number
virtual call for adding two numbers
It is so inefficient that almost nobody does it this way
struct Value {
    bool m_IsDouble;
    union {
        double as_Double;
        void* as_Pointer;
    } m_Value;
};
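A minimal sketch of how the flag-plus-union layout could be used. `TaggedValue`, `MakeNumber` and `AddNumbers` are hypothetical names chosen for illustration. Adding two numbers needs only a flag check, with no heap access and no virtual call, but the flag still costs space.

```cpp
#include <cassert>

// Same layout as the struct above, renamed to keep the sketch self-contained.
struct TaggedValue {
    bool m_IsDouble;
    union {
        double as_Double;
        void* as_Pointer;
    } m_Value;
};

inline TaggedValue MakeNumber(double d) {
    TaggedValue v;
    v.m_IsDouble = true;
    v.m_Value.as_Double = d;
    return v;
}

// Only a flag check stands between the VM and the raw doubles.
inline TaggedValue AddNumbers(const TaggedValue& lhs, const TaggedValue& rhs) {
    assert(lhs.m_IsDouble && rhs.m_IsDouble);
    return MakeNumber(lhs.m_Value.as_Double + rhs.m_Value.as_Double);
}

// The one-byte flag is padded up to the alignment of the union, so the
// struct is always larger than a bare double (16 B on typical 64-bit targets).
static_assert(sizeof(TaggedValue) > sizeof(double),
              "the type flag costs extra space");
```

The value now lives inline, so the cache miss and the virtual call are gone, but on typical 64-bit targets the size is still twice that of a plain double.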
A double has 64 bits, from least to most significant: 52 bits of mantissa, 11 bits of exponent, 1 sign bit.
Value = (-1) ^ sign * 1.Mantissa * 2 ^ (exponent - 1023)
Not all possible values are actually valid numbers.
double r = sqrt(-1);
NaN - Not a Number
On x86_64 / arm64 with the usual 48-bit virtual address space, addresses at or above 2^48 are reserved and can’t be used by user applications, so a user-space pointer fits in 48 bits
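The two observations above can be checked directly. A NaN is any bit pattern with all 11 exponent bits set and a non-zero mantissa, so most mantissa bits are free to carry a payload. The sketch below is only an illustration: the helper names are hypothetical, and limiting the payload to 48 bits is an assumption made to match the pointer size.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits");

// Copy the raw 64 bits of a double into an integer.
inline std::uint64_t BitsOf(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// Reinterpret 64 raw bits as a double.
inline double FromBits(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// A NaN has all 11 exponent bits set and a non-zero mantissa.
inline bool IsNaNPattern(std::uint64_t bits) {
    const std::uint64_t exponent = (bits >> 52) & 0x7FF;
    const std::uint64_t mantissa = bits & 0x000FFFFFFFFFFFFFULL;
    return exponent == 0x7FF && mantissa != 0;
}

// Build a quiet NaN that carries an arbitrary payload in its low 48 bits.
inline double MakeNaNWithPayload(std::uint64_t payload) {
    assert(payload < (1ULL << 48));
    return FromBits(0x7FF8000000000000ULL | payload);
}
```

Every value produced by `MakeNaNWithPayload` is still a NaN as far as the hardware is concerned, yet the low 48 bits can be read back with `BitsOf`.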
enum ValueType {
    Number = 0,
    Null = 1,
    Undefined = 2,
    Boolean = 3,
    String = 4,
    Array = 5,
    Object = 6,
    Function = 7,
};
We need an easy way to split a double into bits: use a union to share the memory between a double and structs with bit-fields.
union {
    double as_double;
    NanPointer as_pointer;
    CheckType to_check;
} m_value;
struct NanPointer
{
    // order from least to most significant bits
    // bit field - has type size_t, but only 48 bits
    size_t pointer:48;
    size_t tag:3; // 0 for a real NaN and non-zero for our types
    size_t nan:13; // must be all 1s to force a NaN
};
// Used to check whether the value is a double or not
struct CheckType
{
    size_t payload:48;
    size_t check:16;
};
bool is_double() const
{
    // check holds the top 16 bits: the 13 NaN bits followed by the 3 tag bits.
    // Anything above 0xFFF8 has all NaN bits set and a non-zero tag,
    // so it is one of our boxed types rather than a double or a real NaN.
    return m_value.to_check.check <= 0xFFF8;
}
double get_double() const
{
    assert(is_double());
    return m_value.as_double;
}
int get_tag() const
{
    assert(!is_double());
    return m_value.as_pointer.tag;
}
void* get_pointer() const
{
    assert(!is_double());
    return (void*)(size_t) m_value.as_pointer.pointer;
}
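Bit-field layout and reading an inactive union member are both compiler-dependent, so here is a sketch of the same boxing scheme written with shifts, masks and `memcpy` instead. This is a swapped-in portable variant, not the code from the slides. `BoxedValue` and its constants are hypothetical names, and the scheme still assumes 48-bit user-space pointers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

class BoxedValue {
public:
    // Box a number: the double's bits are stored unchanged.
    explicit BoxedValue(double d) { std::memcpy(&m_bits, &d, sizeof m_bits); }

    // Box a pointer. tag must be in 1..7; tag 0 is reserved for real NaNs.
    BoxedValue(unsigned tag, const void* p) {
        const std::uint64_t address = reinterpret_cast<std::uintptr_t>(p);
        assert(tag >= 1 && tag <= 7);
        assert((address >> 48) == 0);  // assumption: pointers fit in 48 bits
        m_bits = kNanBits | (std::uint64_t(tag) << 48) | address;
    }

    // Top 16 bits above 0xFFF8 mean "13 NaN bits set and a non-zero tag".
    bool is_double() const { return (m_bits >> 48) <= 0xFFF8; }

    double get_double() const {
        assert(is_double());
        double d;
        std::memcpy(&d, &m_bits, sizeof d);
        return d;
    }

    unsigned get_tag() const {
        assert(!is_double());
        return unsigned((m_bits >> 48) & 0x7);
    }

    void* get_pointer() const {
        assert(!is_double());
        return reinterpret_cast<void*>(std::uintptr_t(m_bits & kPointerMask));
    }

private:
    static constexpr std::uint64_t kNanBits = 0xFFF8000000000000ULL;
    static constexpr std::uint64_t kPointerMask = 0x0000FFFFFFFFFFFFULL;
    std::uint64_t m_bits;
};

static_assert(sizeof(BoxedValue) == 8, "a boxed value fits in one 64-bit word");
```

A number and a tagged pointer now both occupy exactly 8 bytes, and telling them apart is a single shift and compare.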
Strings in JavaScript are immutable, which means that any operation that changes a string returns a new string.
String interning: a method of storing only one instance of any given string, which must be immutable.
This means that there is only one copy of "answer" inside the VM
Comparing strings for equality becomes very fast.
Two strings are the same if and only if their pointers are the same.
struct StringValue
{
    std::string Value;
    // extra data
    // int RefCount;
    // int GCFlags;
};
class Spasm
{
    // Requires a std::hash specialization and operator== for StringValue
    typedef std::unordered_set<StringValue> StringTable;
    StringTable m_Strings;
};
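Value Spasm::AllocateString(const char* s) {
    auto current = m_Strings.find(s);
    if (current != m_Strings.end())
    {
        // Already interned: reuse the existing instance
        return Value(ValueType::String, &(*current));
    }
    // insert() returns a pair: .first is the iterator to the new element,
    // .second is a bool telling whether the insertion took place
    auto insert = m_Strings.insert(StringValue(s));
    return Value(ValueType::String, &(*insert.first));
}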
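The interning idea can be tried in isolation. The sketch below uses a plain `std::unordered_set<std::string>` rather than the VM's `StringValue` and `Value` types. `StringInterner` and `SameString` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical minimal interner: owns exactly one copy of each distinct string.
class StringInterner {
public:
    // Returns a stable pointer; equal contents always yield the same pointer.
    const std::string* Intern(const std::string& s) {
        // insert() leaves the set unchanged if the string is already present
        // and returns an iterator to the stored element either way.
        auto result = m_strings.insert(s);
        return &*result.first;
    }

    std::size_t Size() const { return m_strings.size(); }

private:
    // Pointers to elements of std::unordered_set stay valid across rehashing,
    // so handing them out is safe as long as nothing is erased.
    std::unordered_set<std::string> m_strings;
};

// Equality of interned strings is a single pointer comparison.
inline bool SameString(const std::string* a, const std::string* b) {
    assert(a != nullptr && b != nullptr);
    return a == b;
}
```

Interning `"answer"` twice, even when the second copy is built at run time, yields the same pointer, and the table keeps a single entry for it.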