This talk argues that Scala developers should understand the basics of the Java Virtual Machine (JVM), the platform their code runs on. It shows the Java bytecode produced from simple Scala snippets to demonstrate how the JVM executes code. Key points: the JVM is a stack-based virtual machine that executes bytecode instructions compiled from source code, and understanding the level below your code helps you write more efficient and robust code.
10. Origins of this talk
There is a vast number of newcomers to the Scala
world. Some percentage of those developers
have never programmed in Java before.
Do they have a notion of the platform they are
running their software on?
12. Origins of this talk
There is a significant number of Java
developers obsessed with all kinds of APIs who do not
know a single thing about the platform that
they are using.
Those Java developers are beginning to move towards
other JVM languages (like Scala).
22. Should we even care?
Knowing one level below your level leads to:
● Being a better engineer in general
● The ability to reason more accurately about the code
● Improved performance
● Moving from folklore beliefs towards scientific methods
● Separation of dogma from facts
● Efficiency in handling non-trivial errors and bugs
30. Should we even care?
We believe that as Scala developers
we should at least understand the basics of
the JVM platform in order to achieve
efficiency, understanding, robustness,
determinism and sanity.
45. Source: https://en.wikipedia.org/wiki/Strongtalk
“Work began in 1994 and they completed an
implementation in 1996. The company was bought by
Sun Microsystems in 1997, and the team got focused
on Java, releasing the HotSpot virtual machine, and
work on Strongtalk stalled.”
56. So what exactly is bytecode?
Bytecode is an instruction set.
Each instruction is identified by a 1-byte opcode.
Thus there are only 256 opcodes possible.
202 are currently in use, 51 are unassigned and left for
future use, and 3 are ‘reserved
opcodes’.
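To make the one-byte opcode idea concrete, here is a minimal sketch of a decoder for a handful of opcodes. The opcode values are the ones listed in the JVM specification; the `OpcodeSketch` object and its `decode` helper are hypothetical names introduced for illustration, and only operand-free instructions are handled.

```scala
object OpcodeSketch {
  // A tiny subset of the JVM instruction set: opcode byte -> mnemonic.
  // Only instructions without operands are listed, to keep decoding trivial.
  val mnemonics: Map[Int, String] = Map(
    0x0a -> "lconst_1", // push the long constant 1
    0x1f -> "lload_1",  // push the long held in local variable 1
    0x61 -> "ladd",     // pop two longs, push their sum
    0xad -> "lreturn"   // return the long on top of the stack
  )

  // Decode a method body made only of the opcodes above.
  def decode(code: Array[Byte]): List[String] =
    code.toList.map { b =>
      val opcode = b & 0xff // bytes are signed on the JVM, so mask to 0..255
      mnemonics.getOrElse(opcode, f"unknown(0x$opcode%02x)")
    }
}
```

Calling `OpcodeSketch.decode(Array(0x0a, 0x1f, 0x61, 0xad).map(_.toByte))` yields the mnemonics `lconst_1`, `lload_1`, `ladd`, `lreturn`. A real disassembler such as `javap` also has to skip the operand bytes that follow many opcodes.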
62. The JVM is a stack machine
● No registers, no accumulators, no stack pointers
● Why stack-based? Two theories:
1. Portability: with different target platforms, there is no need to worry about
the number and sizes of registers
2. Compactness of bytecode
● It is easy to learn
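A minimal sketch of what "stack machine" means in practice: every instruction takes its operands from a stack and pushes its result back, so no instruction needs to name a register. The `StackMachineSketch` object and its three instructions are hypothetical and far simpler than the real JVM instruction set.

```scala
object StackMachineSketch {
  sealed trait Instr
  case class Push(value: Long) extends Instr // push a constant
  case object Add extends Instr              // pop two values, push their sum
  case object Mul extends Instr              // pop two values, push their product

  // Run the program; the result is whatever is left on top of the stack.
  def run(program: List[Instr]): Long = {
    val finalStack = program.foldLeft(List.empty[Long]) {
      case (stack, Push(v))      => v :: stack
      case (b :: a :: rest, Add) => (a + b) :: rest
      case (b :: a :: rest, Mul) => (a * b) :: rest
      case (stack, instr)        => sys.error(s"stack underflow at $instr with $stack")
    }
    finalStack.head
  }
}
```

For example, `(1 + 2) * 3` becomes `List(Push(1), Push(2), Add, Push(3), Mul)`. Because the instructions carry no register names, the encoded program stays compact, which is the second theory mentioned above.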
104. def f1(i: Long) =
{ 1 + i }
0: lconst_1
1: lload_1
2: ladd
3: lreturn
Operand stack:
1
Local variables:
0 this
1 i
Stack slots are 32 bits wide.
Thus a Long (64-bit) must
take two slots on the
stack.
Some consider the 32-bit stack “the
biggest mistake Sun ever made”.
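The two-slot rule also affects how local variables are numbered. The following sketch computes the slot index of each parameter of an instance method, assuming slot 0 holds `this`. The `LocalSlotsSketch` object and the use of plain type names as strings are simplifications made for illustration.

```scala
object LocalSlotsSketch {
  // Number of 32-bit slots a JVM value of the given type occupies.
  def slotSize(typeName: String): Int = typeName match {
    case "Long" | "Double" => 2 // 64-bit primitives take two slots
    case _                 => 1 // Int, Float, references, ... take one
  }

  // Slot index of each parameter of an instance method.
  // Slot 0 holds `this`, so the first parameter starts at slot 1.
  def parameterSlots(paramTypes: List[String]): List[Int] =
    paramTypes.scanLeft(1)((slot, tpe) => slot + slotSize(tpe)).init
}
```

For `f1(i: Long)` this gives slot 1 for `i`, matching the `lload_1` instruction. For a hypothetical `f2(a: Long, b: Int)` the parameter `b` would land in slot 3, because `a` occupies slots 1 and 2.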
115. def f1 =
"Hello world!"
0: ldc #12
2: areturn
#1 ….
... ...
#12 Hello world!
The table above is known as the ‘constant pool’. It is
designed to hold constant values
(most of the time UTF-8 strings) that
can be referenced by #number.
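A toy model of that lookup, assuming a pool that only holds strings. The `ConstantPoolSketch` object is hypothetical; a real constant pool also stores class, method and field references and is read from the class file.

```scala
object ConstantPoolSketch {
  // A toy constant pool: index -> constant. Real pools also hold class,
  // method and field references; only strings are modelled here.
  val constantPool: Map[Int, String] = Map(12 -> "Hello world!")

  // Mimics `ldc #index` followed by `areturn`:
  // look the constant up by its index, then return it.
  def ldcThenReturn(index: Int): String = {
    val operandStack = List(constantPool(index)) // ldc pushes the constant
    operandStack.head                            // areturn returns the top
  }
}
```

The instruction itself only carries the index, which keeps the bytecode short no matter how long the string is.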
116. def f1 =
"Hello world!"
0: ldc #12 // String Hello world!
2: areturn
#1 ….
... ...
#12 Hello world!
Our tools help us, so we tend to
look not at the number but at the
comment provided.
133. InvokeVirtual: Invoke the most derived implementation of this method available on the given object.
InvokeSpecial: Screw what the virtual table tells you to do. Invoke the method on exactly the class provided.
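The difference can be observed from plain Scala. In this sketch (the `DispatchSketch` object and its classes are hypothetical), an ordinary method call goes through virtual dispatch, while a `super` call has to target one exact class, which is the case the compiler typically covers with `invokespecial`.

```scala
object DispatchSketch {
  class Animal {
    def sound: String = "..."
  }

  class Dog extends Animal {
    override def sound: String = "Woof"
    // `super.sound` must bypass virtual dispatch, otherwise it would call
    // Dog.sound again; this is the situation invokespecial exists for.
    def parentSound: String = super.sound
  }

  // The static type is Animal, but virtual dispatch picks the most derived
  // implementation for the runtime object.
  def describe(a: Animal): String = a.sound
}
```

Here `DispatchSketch.describe(new DispatchSketch.Dog)` returns `"Woof"` even though the parameter is typed as `Animal`, whereas `parentSound` returns the `Animal` implementation.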
141. #CAFEBABE
“We used to go to lunch at a place called St Michael's Alley. According to local legend, in
the deep dark past, the Grateful Dead used to perform there before they made it big. It
was a pretty funky place that was definitely a Grateful Dead Kinda Place. When Jerry
died, they even put up a little Buddhist-esque shrine. When we used to go there, we
referred to the place as Cafe Dead. Somewhere along the line it was noticed that this was
a HEX number. I was re-vamping some file format code and needed a couple of magic
numbers: one for the persistent object file, and one for classes. I used CAFEDEAD for
the object file format, and in grepping for 4 character hex words that fit after "CAFE" (it
seemed to be a good theme) I hit on BABE and decided to use it. At that time, it didn't
seem terribly important or destined to go anywhere but the trash-can of history. So
CAFEBABE became the class file format, and CAFEDEAD was the persistent object
format. But the persistent object facility went away, and along with it went the use of
CAFEDEAD - it was eventually replaced by RMI."
-- James Gosling
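The magic number is easy to check by hand. The sketch below reads the fixed-size prefix of a class file: the 4-byte magic, the minor and major versions, and the constant pool count. The `ClassHeaderSketch` object, the `Header` case class and the hand-written `sample` bytes are assumptions made for illustration; the field order follows the class file format.

```scala
import java.io.{ByteArrayInputStream, DataInputStream, InputStream}

object ClassHeaderSketch {
  final case class Header(magic: Int, minor: Int, major: Int, constantPoolCount: Int)

  // Read the fixed-size prefix of a class file. DataInputStream is
  // big-endian, which matches the class file format.
  def readHeader(in: InputStream): Header = {
    val data = new DataInputStream(in)
    val magic = data.readInt()               // u4 magic
    val minor = data.readUnsignedShort()     // u2 minor_version
    val major = data.readUnsignedShort()     // u2 major_version
    val poolCount = data.readUnsignedShort() // u2 constant_pool_count
    Header(magic, minor, major, poolCount)
  }

  // Hand-written bytes standing in for the start of a real class file.
  val sample: Array[Byte] =
    Array(0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x00, 0x00, 0x34, 0x00, 0x0D).map(_.toByte)

  def sampleHeader: Header = readHeader(new ByteArrayInputStream(sample))
}
```

To inspect a real class, pass `readHeader` a `java.io.FileInputStream` opened on any `.class` file produced by the compiler.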
163. Question: How is Scala’s String Interpolation
implemented?
def f1(thing: String) = s"it's a ${thing}!!!"
44: invokevirtual #42 // Method scala/StringContext.s:(Lscala/collection/Seq;)
47: areturn
164. Question: How is Scala’s String Interpolation
implemented?
def f1(thing: String) = s"it's a ${thing}!!!"
0: new #12 // class scala/StringContext
3: dup
4: getstatic #18 // Field scala/Predef$.MODULE$:Lscala/Predef$;
7: iconst_2
8: anewarray #20 // class java/lang/String
11: dup
12: iconst_0
13: ldc #22 // String it's a
15: aastore
16: dup
String interpolation produces rather more
complex bytecode than
string concatenation does.
Whether this causes any side
effects or performance issues, however, is a separate
question.
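The bytecode above builds a `StringContext` from the literal parts and then calls its `s` method. A minimal sketch of the same thing written out by hand follows; `InterpolationSketch` and its method names are hypothetical, and newer compiler versions may emit different bytecode for the interpolated form.

```scala
object InterpolationSketch {
  def interpolated(thing: String): String = s"it's a ${thing}!!!"

  // Roughly what the interpolated string stands for: the literal parts go
  // into a StringContext, the arguments go to its `s` method.
  def desugared(thing: String): String =
    StringContext("it's a ", "!!!").s(thing)

  // Plain concatenation, for comparison.
  def concatenated(thing: String): String = "it's a " + thing + "!!!"
}
```

All three methods return the same string; they differ only in the bytecode the compiler produces for them.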
168. Question: How are lambdas implemented?
class A17() { def f1 = () => "yeah" }
public final class A17$$anonfun$f1$1 extends scala.runtime.AbstractFunction0<java.lang.String>
0: ldc #18 // String yeah
2: areturn
169. Question: How are lambdas implemented?
class A17() { def f1 = () => "yeah" }
0: new #12 // class A17$$anonfun$f1$1
3: dup
4: aload_0
5: invokespecial #16 // Method A17$$anonfun$f1$1."<init>":(LA17;)V
8: areturn
Lambdas are implemented using inner
classes
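A hand-written approximation of that translation, as a sketch only. `LambdaClassSketch`, `F1Lambda` and `f1Expanded` are hypothetical names; the class the compiler really generates has a mangled name and, as the constructor descriptor `(LA17;)V` shows, also receives a reference to the enclosing instance, which is left out here.

```scala
import scala.runtime.AbstractFunction0

object LambdaClassSketch {
  // Hand-written stand-in for the class the compiler generates.
  // `apply` holds the lambda body, matching the `ldc` / `areturn` pair.
  final class F1Lambda extends AbstractFunction0[String] {
    def apply(): String = "yeah"
  }

  class A17 {
    def f1: () => String = () => "yeah"         // what the developer writes
    def f1Expanded: () => String = new F1Lambda // roughly what gets emitted
  }
}
```

Both `f1` and `f1Expanded` return a `Function0[String]` whose `apply()` yields `"yeah"`.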
173. Question: Does it mean it produces an anonymous inner
class for every tiny lambda?
def d2 = {
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
}
0: aload_0
1: new #26 // class A16$$anonfun$d2$1
4: dup
5: aload_0
6: invokespecial #30 // Method A16$$anonfun$d2$1."<init>":(LA16;)V
9: invokevirtual #32 // Method d1:(Lscala/Function0;)Ljava/lang/String;
12: pop
13: aload_0
14: new #34 // class A16$$anonfun$d2$2
17: dup
18: aload_0
19: invokespecial #35 // Method A16$$anonfun$d2$2."<init>":(LA16;)V
22: invokevirtual #32 // Method d1:(Lscala/Function0;)Ljava/lang/String;
25: pop
26: aload_0 ….
With Scala 2.12-M2, or Scala 2.11.7 + flag:
175. Question: Does it mean it produces an anonymous inner
class for every tiny lambda?
def d2 = {
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
}
0: aload_0
1: invokedynamic #52, 0 // InvokeDynamic #0:apply:()Lscala/runtime/java8/JFunction0;
6: checkcast #19 // class scala/Function0
9: invokevirtual #54 // Method d1:(Lscala/Function0;)Ljava/lang/String;
12: pop
13: aload_0
14: invokedynamic #59, 0 // InvokeDynamic #1:apply:()Lscala/runtime/java8/JFunction0;
19: checkcast #19 // class scala/Function0
22: invokevirtual #54 // Method d1:(Lscala/Function0;)Ljava/lang/String;
...
An inner class is created for every lambda in
the program. However, Scala 2.11.7 with a flag, and
Scala 2.12.x by default, use
InvokeDynamic to implement lambdas
(similar to how they are implemented in Java 8).
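One informal way to see which strategy a given compiler used is to look at the runtime class name of a lambda. This is only a sketch: `LambdaRuntimeSketch` and `strategy` are hypothetical, and the name patterns checked here are conventions observed in scalac and the JDK, not a documented API.

```scala
object LambdaRuntimeSketch {
  val lambda: () => String = () => "a"

  // Classify how a function value was compiled by looking at its runtime
  // class name. The patterns are naming conventions, not a stable API.
  def strategy(f: AnyRef): String = {
    val name = f.getClass.getName
    if (name.contains("$$Lambda")) "invokedynamic"
    else if (name.contains("$anonfun$")) "anonymous inner class"
    else "unknown"
  }
}
```

Printing `LambdaRuntimeSketch.strategy(LambdaRuntimeSketch.lambda)` under different compiler versions is a quick sanity check; inspecting the output of `javap -c` remains the reliable way to confirm.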
192. Homework:
1. How can an inner class access a private field of
its outer class? How is that even possible?
2. Are lambdas given copies of
data by value or by reference?
3. How is Nothing translated to bytecode?
4. How are traits implemented? (2.11.7 vs 2.12.x)
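241. G1
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._

// Baseline: the computation alone, with no extra heap allocation.
@Benchmark
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MILLISECONDS)
def baseline() = {
  val result = fib(base + rand.nextInt(randBase))
  // Return the value so that JMH consumes it and the JIT cannot discard the work.
  result + rand.nextInt()
}

// Same computation, surrounded by 100 allocations of 1 MB that are then released.
@Benchmark
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MILLISECONDS)
def testMethod(memory: Memory) = {
  for (i <- 0 until 100) memory.heap(i) = new Array[Byte](1024 * 1024)
  val result = fib(base + rand.nextInt(randBase))
  for (i <- 0 until 100) memory.heap(i) = null
  result + rand.nextInt()
}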
247. G1
object MyBenchmarkBatch {
  // One Memory instance is shared by all benchmark threads.
  @State(Scope.Benchmark)
  class Memory {
    val heap = new Array[Array[Byte]](100)
  }
}

@State(Scope.Thread)
class MyBenchmarkBatch {
  import MyBenchmarkBatch.Memory

  @Param(Array("10"))
  var offset: Int = _
  var startPtr: Int = 0
  var endPtr: Int = 0
  def testMethod(memory: Memory) = ...
}
278. How to make a thread dump
● jstack [-Flm] <pid>
● kill -3 <pid>
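Besides the two external commands above, a rough dump can also be produced from inside the running JVM with the standard `Thread.getAllStackTraces` API. The sketch below is an assumption-laden alternative, not what `jstack` does internally: `ThreadDumpSketch` is a hypothetical name and the output format is simplified.

```scala
object ThreadDumpSketch {
  // Build a textual dump of all live threads in the current JVM.
  def dump(): String = {
    val sb = new StringBuilder
    val it = Thread.getAllStackTraces().entrySet().iterator()
    while (it.hasNext) {
      val entry = it.next()
      val thread = entry.getKey
      sb.append("\"" + thread.getName + "\" state=" + thread.getState + "\n")
      entry.getValue.foreach(frame => sb.append("    at " + frame + "\n"))
      sb.append("\n")
    }
    sb.toString
  }
}
```

Calling `println(ThreadDumpSketch.dump())` prints one block per thread. Unlike `jstack`, it carries no native thread ids or lock information.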
279. But what about GC logs?
"org.openjdk.jmh.samples.MyBenchmarkLatency.baseline-jmh-worker-3" #13 daemon prio=5 os_prio=0 tid=0x00007fb5c0243000 nid=0x22fb runnable [0x00007fb5a6b50000]
   java.lang.Thread.State: RUNNABLE
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fibInner$1(MyBenchmarkLatency.scala:56)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fib(MyBenchmarkLatency.scala:59)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.baseline(MyBenchmarkLatency.scala:35)
        at org.openjdk.jmh.samples.generated.MyBenchmarkLatency_baseline.baseline_AverageTime(MyBenchmarkLatency_baseline.java:124)
280. But what about GC logs?
"org.openjdk.jmh.samples.MyBenchmarkLatency.baseline-jmh-worker-3" #13 daemon prio=5 os_prio=0 tid=0x00007fb5c0243000 nid=0x22fb runnable [0x00007fb5a6b50000]
   java.lang.Thread.State: RUNNABLE
        at java.math.BigInteger.add(BigInteger.java:1315)
        at java.math.BigInteger.add(BigInteger.java:1221)
        at scala.math.BigInt.$plus(BigInt.scala:203)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fibInner$1(MyBenchmarkLatency.scala:56)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fib(MyBenchmarkLatency.scala:59)