At the end of the day, the Python interpreter (CPython) is native software, written in C
Most commonly, efficient Python code is written in C or C++ and then wrapped into Python
Put simply, the JVM too is native (C++) software
Can Python code be written in Java and then wrapped into Python?
Apparently yes, via several ad-hoc bridging technologies:
Why exactly JPype?
Ensure you have a JVM installed on your system, with the JAVA_HOME environment variable set to the JVM installation directory
Ensure your compiled Java code is available as a .jar file, e.g. /path/to/my.jar
Install the JPype package via pip install JPype1
You first need JPype to start a JVM instance in your Python process
by default, the JVM is looked up via the JAVA_HOME environment variable
import jpype
# start the JVM
jpype.startJVM(classpath=["/path/to/my.jar"])
Once the JVM is started, one can import Java classes and call their methods as if they were Python objects
import jpype.imports # this is necessary to import Java classes
from java.lang import System # import the java.lang.System class
System.out.println("Hello World!")
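A JVM can be started only once per Python process, so start-up code is often guarded. The following is a minimal sketch of such a guard, not part of JPype itself: the function name `ensure_jvm_started` and its `jpype_module` parameter are hypothetical, while `isJVMStarted` and `startJVM` are the JPype calls it relies on.

```python
def ensure_jvm_started(classpath, jpype_module=None):
    """Start the JVM with the given classpath unless one is already running.

    Returns True if this call started the JVM, False otherwise.
    `jpype_module` is a hypothetical hook for injecting a stand-in in tests;
    by default the real `jpype` package is imported lazily.
    """
    if jpype_module is None:
        import jpype as jpype_module  # requires `pip install JPype1`
    if jpype_module.isJVMStarted():
        return False
    jpype_module.startJVM(classpath=list(classpath))
    return True
```

Calling `ensure_jvm_started(["/path/to/my.jar"])` twice in a row should then be harmless: the second call finds the JVM running and does nothing.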
Overview on the official documentation
Java classes are presented, wherever possible, similarly to Python classes
the only major difference is that Java classes and objects are closed and cannot be modified
Java exceptions extend from Python exceptions
Java exceptions can be dealt with in the same way as Python native exceptions, i.e. via try-except blocks
JException serves as the base class for all Java exceptions
most Python primitives directly map into Java primitives
however, Python does not have the same primitive types…
… hence, explicit casts may be needed in some cases
each primitive Java type is exposed in JPype (jpype.JBoolean, .JByte, .JChar, .JShort, .JInt, .JLong, .JFloat, .JDouble)
Java strings are similar to Python strings
they are both immutable and produce a new string when altered
most operations can use Java strings in place of Python strings
when comparing or using strings as dictionary keys, all JString objects should be converted to Python strings, e.g. via str()
Java arrays are mapped to Python lists
more precisely, they operate like Python lists, but they are fixed in size
reading a slice from a Java array returns a view of the array, not a copy
passing a slice of a Python list to Java will create a copy of the sub-list
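The difference between a view and a copy can be illustrated without a JVM. The snippet below is only a pure-Python analogy: `memoryview` plays the role of a slice read from a Java array, and a plain list slice plays the role of the copy made when a Python slice is passed to Java.

```python
# Pure-Python analogy only, no JPype involved.
buffer = bytearray([1, 2, 3, 4])
view = memoryview(buffer)[1:3]   # a view: shares storage with `buffer`
view[0] = 9                      # writing through the view...
# ...changes the underlying buffer, which is now bytearray([1, 9, 3, 4])

items = [1, 2, 3, 4]
sub = items[1:3]                 # a copy: independent of `items`
sub[0] = 9                       # writing to the copy...
# ...leaves the original untouched: items is still [1, 2, 3, 4]
```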
Java collections are overloaded with Python syntax where possible
Java’s Iterables are mapped to Python iterables by overriding the __iter__ method
Java’s Collections are mapped to Python containers by overriding __len__
Java’s Maps support Python’s dictionary syntax by overriding __getitem__ and __setitem__
Java’s Lists support Python’s list syntax by overriding __getitem__ and __setitem__
Java interfaces can be implemented in Python, via JPype’s decorators
Java’s open / abstract classes cannot be extended in Python
Python lambda expressions can be cast to Java’s functional interfaces
JPype classifies conversions from Python types to Java types into four levels:
none, there is no way to convert
explicit (E), JPype can convert the desired type, but only explicitly via casting
implicit (I), JPype will convert as needed
exact (X), like implicit, but takes priority in overload selection
Consider the following example of Python code with JPype:
import jpype.imports
from java.lang import System
System.out.println(1)
System.out.println(2.0)
System.out.println('A')
Which overload of System.out.println is called among the many admissible ones?
1 is convertible to Java’s int, long, and short: int is the exact match
2.0 is convertible to Java’s float and double: double is the exact match
'A' is convertible to Java’s String and char: String is the exact match
Consider the following example of Python code with JPype:
import jpype
csv = jpype.JClass("io.github.gciatto.csv.Csv")
csv.headerOf(["field", "another field"])
This would raise the following error:
TypeError: Ambiguous overloads found for io.github.gciatto.csv.Csv.headerOf(list) between:
public static final io.github.gciatto.csv.Header io.github.gciatto.csv.Csv.headerOf(java.lang.Iterable)
public static final io.github.gciatto.csv.Header io.github.gciatto.csv.Csv.headerOf(java.lang.String[])
list is convertible to both Java’s Iterable and String[]
To solve this issue, one can explicitly cast the Python list to the desired Java type:
import jpype
import jpype.imports
from java.lang import Iterable as JIterable
csv = jpype.JClass("io.github.gciatto.csv.Csv")
csv.headerOf(JIterable@["field", "another field"])
# returns Header("field", "another field")
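The role of conversion levels in overload selection can be sketched with a toy model. This is not JPype's actual resolver, which handles several arguments and may apply further tie-breaking rules; the function `select_overload` and the level constants are hypothetical names used only for illustration.

```python
# Conversion levels, ordered from worst to best.
NONE, EXPLICIT, IMPLICIT, EXACT = range(4)

def select_overload(candidates):
    """Pick one overload from {signature: conversion level of the argument}.

    Toy model for single-argument calls: overloads needing an explicit cast
    (or with no conversion at all) are discarded; an exact match beats
    implicit ones; several equally good matches are reported as ambiguous.
    """
    viable = {sig: lvl for sig, lvl in candidates.items() if lvl >= IMPLICIT}
    if not viable:
        raise TypeError("No matching overloads found")
    best = max(viable.values())
    winners = [sig for sig, lvl in viable.items() if lvl == best]
    if len(winners) > 1:
        raise TypeError(f"Ambiguous overloads found between: {', '.join(winners)}")
    return winners[0]

# println(1): int is exact, long and short are only implicit
chosen = select_overload(
    {"println(int)": EXACT, "println(long)": IMPLICIT, "println(short)": IMPLICIT}
)

# headerOf(list): both overloads are implicit, hence the ambiguity
try:
    select_overload({"headerOf(Iterable)": IMPLICIT, "headerOf(String[])": IMPLICIT})
    ambiguous = False
except TypeError:
    ambiguous = True
```

In this model an explicit cast such as `JIterable@[...]` amounts to turning one candidate into an exact match, which is why it removes the ambiguity.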
One may customise the behaviour of Java types in Python by providing custom implementations for them, via the @JImplementationFor decorator
In that case the special method __jclass_init__ is called on the custom implementation, just once, to configure the class
In type hierarchies, implementations provided for superclasses are inherited by subclasses
Consider for instance the following customisations, which allow Java collections to be used with Python syntax:
import jpype
from typing import Iterable, Sequence

@jpype.JImplementationFor("java.lang.Iterable")
class _JIterable:
    def __jclass_init__(self):
        Iterable.register(self) # makes this class a subtype of Iterable, to speed up isinstance checks
    def __iter__(self):
        return self.iterator()

@jpype.JImplementationFor("java.util.Collection")
class _JCollection:
    def __len__(self):
        return self.size() # supports "len(coll)" syntax
    def __delitem__(self, i):
        return self.remove(i) # supports "del coll[i]" syntax
    def __contains__(self, i):
        return self.contains(i) # supports "i in coll" syntax
    # __iter__ is inherited from _JIterable
    # because in Java: Collection extends Iterable

@jpype.JImplementationFor('java.util.List')
class _JList(object):
    def __jclass_init__(self):
        Sequence.register(self) # makes this class a subtype of Sequence, to speed up isinstance checks
    def __getitem__(self, ndx):
        return self.get(ndx) # supports "list[i]" syntax
    def append(self, obj):
        return self.add(obj) # supports "list.append(obj)" syntax
    # __len__, __delitem__, __contains__, __iter__ are inherited from _JCollection
this is taken directly from JPype’s codebase
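The effect of the `register` calls can be observed without a JVM. The sketch below uses a hypothetical stand-in class, `FakeJavaList`, in place of a wrapped `java.util.List`, and shows how registration with an abstract base class changes the outcome of `isinstance`.

```python
from collections.abc import Sequence

class FakeJavaList:
    """Hypothetical stand-in for a wrapped java.util.List (no JVM involved)."""
    def __init__(self, *items):
        self._items = list(items)
    # Java-style API
    def size(self):
        return len(self._items)
    def get(self, index):
        return self._items[index]
    # Pythonic layer, mirroring what a customiser would add
    def __len__(self):
        return self.size()
    def __getitem__(self, index):
        return self.get(index)

before = isinstance(FakeJavaList(), Sequence)   # False: not registered yet
Sequence.register(FakeJavaList)                 # declare it a virtual subclass
after = isinstance(FakeJavaList(), Sequence)    # True from now on
```

Registration does not add any method to the class: it only makes `isinstance` and `issubclass` checks against `Sequence` succeed, so the Pythonic methods still have to be provided by the customisation.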
The code wrapped via JPype is not Pythonic by default
It is important to make the wrapped code as Pythonic as possible
io.github.gciatto.csv.Csv → jcsv.Csv
snake_case instead of camelCase
__len__ for java.util.Collection
__getitem__ for java.util.List
All such refinements can be done in JPype via customisations of the Java types
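As an illustration of the naming refinement, snake_case aliases for camelCase method names could be derived mechanically. This is a hedged sketch: `to_snake_case` and `snake_case_aliases` are hypothetical helpers, not part of JPype or of the wrapped library, and acronyms are deliberately not handled.

```python
import re

def to_snake_case(name):
    """Convert a simple camelCase identifier to snake_case.

    Illustrative only: runs of capitals (e.g. acronyms) are not handled specially.
    """
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

def snake_case_aliases(method_names):
    """Map each snake_case Python alias to the camelCase Java name it stands for."""
    return {to_snake_case(n): n for n in method_names}

aliases = snake_case_aliases(["indexOf", "getHeader", "parseCsvFile"])
```

Such a mapping could then drive the definition of Pythonic methods inside a customisation class, each one delegating to its Java counterpart.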
For all public types in the wrapped Java library:
jcsv package (pt. 1)
The jcsv package is a Pythonic wrapper for our JVM-based io.github.gciatto.csv library
Java’s type definitions are brought to Python in jcsv/__init__.py:
import jpype
import jpype.imports
from java.lang import Iterable as JIterable
_csv = jpype.JPackage("io.github.gciatto.csv")
Table = _csv.Table
Row = _csv.Row
Record = _csv.Record
Header = _csv.Header
Formatter = _csv.Formatter
Parser = _csv.Parser
Configuration = _csv.Configuration
Csv = _csv.Csv
CsvJvm = _csv.CsvJvm
making it possible to write the following code on the user side:
from jcsv import Table, Record, Header
jcsv package (pt. 2)
Parsing and formatting operations are mapped straightforwardly to Python functions:
# jcsv/__init__.py
def parse_csv_string(string, separator=Csv.DEFAULT_SEPARATOR, delimiter=Csv.DEFAULT_DELIMITER, comment=Csv.DEFAULT_COMMENT):
    return Csv.parseAsCSV(string, separator, delimiter, comment)

def parse_csv_file(path, separator=Csv.DEFAULT_SEPARATOR, delimiter=Csv.DEFAULT_DELIMITER, comment=Csv.DEFAULT_COMMENT):
    return CsvJvm.parseCsvFile(str(path), separator, delimiter, comment)

def format_as_csv(rows, separator=Csv.DEFAULT_SEPARATOR, delimiter=Csv.DEFAULT_DELIMITER, comment=Csv.DEFAULT_COMMENT):
    return Csv.formatAsCSV(JIterable@rows, separator, delimiter, comment)
jcsv package (pt. 3)
An ad-hoc factory method is provided for building Header instances:
# jcsv/__init__.py
from jcsv.python import iterable_or_varargs

def header(*args):
    # a single integer argument requests an anonymous header with that many columns
    if len(args) == 1 and isinstance(args[0], int):
        return Csv.anonymousHeader(args[0])
    return iterable_or_varargs(args, lambda xs: Csv.headerOf(JIterable@map(str, xs)))
making it possible to write the following code on the user side:
import jcsv
header1 = jcsv.header("column1", "column2", "column3")
header2 = jcsv.header(3) # anonymous header with 3 columns
columns = (f"column{i}" for i in range(1, 4)) # generator expression
header3 = jcsv.header(columns) # same as header1, but passing an iterable
Function iterable_or_varargs aims at simulating multiple overloads:
# jcsv/python.py
from typing import Iterable

def iterable_or_varargs(args, f):
    assert isinstance(args, Iterable)
    if len(args) == 1:
        item = args[0]
        # a lone string is treated as a single item, not as an iterable of characters
        if isinstance(item, Iterable) and not isinstance(item, str):
            return f(item)
        else:
            return f([item])
    else:
        return f(args)
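The behaviour of this helper can be checked in isolation, without a JVM. The sketch below contains a self-contained copy of the function, in which a lone string counts as one item, and wraps it in a hypothetical factory called `names` that simply collects its arguments as a list of strings.

```python
from collections.abc import Iterable

def iterable_or_varargs(args, f):
    """Self-contained copy of the helper; a lone string counts as one item here."""
    if len(args) == 1:
        item = args[0]
        if isinstance(item, Iterable) and not isinstance(item, str):
            return f(item)       # one iterable argument: unpack it
        return f([item])         # one plain argument: wrap it
    return f(args)               # several arguments: use them as they are

def names(*args):
    # `names` is a hypothetical factory used only for this illustration
    return iterable_or_varargs(args, lambda xs: [str(x) for x in xs])

from_varargs = names("a", "b", "c")          # three positional arguments
from_iterable = names(c for c in "abc")      # one generator argument
from_single = names("abc")                   # one string argument
```

The first two calls yield the same three-element list, while the last one yields a single-element list, because the string is not unpacked into characters.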
jcsv package (pt. 4)
An ad-hoc factory method is provided for building Record instances:
# jcsv/__init__.py
def record(header, *args):
    return iterable_or_varargs(args, lambda xs: Csv.recordOf(header, JIterable@map(str, xs)))
An ad-hoc factory method is provided for building Table instances:
# jcsv/__init__.py
def __ensure_header(h):
    return h if isinstance(h, Header) else header(h)

def __ensure_record(r, h):
    return r if isinstance(r, Record) else record(h, r)

def table(header, *args):
    header = __ensure_header(header)
    args = [__ensure_record(row, header) for row in args]
    return iterable_or_varargs(args, lambda xs: Csv.tableOf(header, JIterable@xs))
jcsv package (pt. 5)
The Row class is customised to make it more Pythonic:
# jcsv/__init__.py
# _java is assumed to be an alias for the java.lang package (e.g. import java.lang as _java)
@jpype.JImplementationFor("io.github.gciatto.csv.Row")
class _Row:
    def __len__(self):
        return self.getSize()
    def __getitem__(self, item):
        # negative indexes count from the end, as in Python sequences
        if isinstance(item, int) and item < 0:
            item = len(self) + item
        try:
            return self.get(item)
        except _java.IndexOutOfBoundsException as e:
            raise IndexError(f"index {item} out of range") from e
    @property
    def size(self):
        return len(self)
len(row) instead of row.getSize()
row[i] instead of row.get(i)
row[-i] instead of row.get(row.getSize() - i)
IndexError is raised instead of IndexOutOfBoundsException
row.size instead of row.getSize()
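The indexing logic of this customisation can be exercised without a JVM. In the sketch below, `FakeRow` and `JavaIndexOutOfBounds` are hypothetical stand-ins for the wrapped Java class and for java.lang.IndexOutOfBoundsException, and the Pythonic layer is added by plain subclassing rather than via @JImplementationFor.

```python
class JavaIndexOutOfBounds(Exception):
    """Hypothetical stand-in for java.lang.IndexOutOfBoundsException."""

class FakeRow:
    """Hypothetical stand-in for a wrapped Row: exposes only Java-style methods."""
    def __init__(self, *values):
        self._values = values
    def getSize(self):
        return len(self._values)
    def get(self, index):
        if not 0 <= index < len(self._values):
            raise JavaIndexOutOfBounds(index)
        return self._values[index]

class PythonicRow(FakeRow):
    """Same customisations as _Row, applied by plain subclassing."""
    def __len__(self):
        return self.getSize()
    def __getitem__(self, item):
        if isinstance(item, int) and item < 0:
            item = len(self) + item        # row[-1] becomes row.get(size - 1)
        try:
            return self.get(item)
        except JavaIndexOutOfBounds as e:
            raise IndexError(f"index {item} out of range") from e

row = PythonicRow("a", "b", "c")
last = row[-1]          # last element
as_list = list(row)     # iteration stops at the first IndexError
```

Translating the Java exception into IndexError matters beyond style: Python's fallback iteration protocol relies on `__getitem__` raising IndexError to detect the end of the sequence, which is what makes `list(row)` terminate here.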
jcsv package (pt. 6)
The Header class inherits all customisations for Row, plus the following ones:
@jpype.JImplementationFor("io.github.gciatto.csv.Header")
class _Header:
    @property
    def columns(self):
        return [str(c) for c in self.getColumns()]
    def __contains__(self, item):
        return self.contains(item)
    def index_of(self, column):
        return self.indexOf(column)
header.columns instead of header.getColumns()
column in header instead of header.contains(column)
header.index_of(column) instead of header.indexOf(column)
jcsv package (pt. 7)
The Record class inherits all customisations for Row, plus the following ones:
@jpype.JImplementationFor("io.github.gciatto.csv.Record")
class _Record:
    @property
    def header(self):
        return self.getHeader()
    @property
    def values(self):
        return [str(v) for v in self.getValues()]
    def __contains__(self, item):
        return self.contains(item)
record.header instead of record.getHeader()
record.values instead of record.getValues()
value in record instead of record.contains(value)
jcsv package (pt. 8)
The Table class is customised too, to make it more Pythonic:
# _java is assumed to be an alias for the java.lang package (e.g. import java.lang as _java)
@jpype.JImplementationFor("io.github.gciatto.csv.Table")
class _Table:
    @property
    def header(self):
        return self.getHeader()
    def __len__(self):
        return self.getSize()
    def __getitem__(self, item):
        # negative indexes count from the end, as in Python sequences
        if isinstance(item, int) and item < 0:
            item = len(self) + item
        try:
            return self.get(item)
        except _java.IndexOutOfBoundsException as e:
            raise IndexError(f"index {item} out of range") from e
    def __contains__(self, item):
        return self.contains(item) # supports "record in table" syntax
    @property
    def records(self):
        return self.getRecords()
    @property
    def size(self):
        return len(self)
table.header instead of table.getHeader()
len(table) instead of table.getSize()
table[i] instead of table.get(i)
table[-i] instead of table.get(table.getSize() - i)
record in table instead of table.contains(record)
table.records instead of table.getRecords()
.jars in JPype projects (pt. 1)
csv-python/
├── build.gradle.kts        # this is where the generation of csv.jar is automated
├── jcsv
│   ├── __init__.py
│   ├── jvm
│   │   ├── __init__.py     # this is where JPype is loaded
│   │   └── csv.jar         # this is the fat JAR of the JVM-based library
│   └── python.py
├── requirements.txt
└── test
    ├── __init__.py
    ├── test_parsing.py
    └── test_python_api.py
We need to ensure that the JVM-based library is available on the system where jcsv is installed
The build.gradle.kts file automates the generation of the csv.jar file, and its placement into the jcsv/jvm directory
The jcsv/jvm/__init__.py file loads JPype and the csv.jar file
.jars in JPype projects (pt. 2)
Snippet from the build.gradle.kts:
tasks.create<Copy>("createCoreJar") {
    group = "Python"
    val shadowJar by project(":csv-core").tasks.getting(Jar::class)
    dependsOn(shadowJar)
    from(shadowJar.archiveFile) {
        rename(".*?\\.jar", "csv.jar")
    }
    into(projectDir.resolve("jcsv/jvm"))
}
Content of the jcsv/jvm/__init__.py file:
import jpype
from pathlib import Path
# the directory where csv.jar is placed
CLASSPATH = Path(__file__).parent
# the list of all .jar files in CLASSPATH
JARS = [str(j.resolve()) for j in CLASSPATH.glob('*.jar')]
jpype.startJVM(classpath=JARS)
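The classpath discovery step can be tried out on its own. The sketch below isolates it into a hypothetical helper, `find_jars`, and runs it against a temporary directory that simulates jcsv/jvm; sorting is an addition made here, since the order in which `glob` yields files is not guaranteed.

```python
import tempfile
from pathlib import Path

def find_jars(directory):
    """Return the absolute paths of all .jar files directly inside `directory`, sorted."""
    return sorted(str(j.resolve()) for j in Path(directory).glob("*.jar"))

# Simulate the jcsv/jvm directory with a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    for name in ("csv.jar", "extra.jar", "__init__.py"):
        (Path(tmp) / name).touch()
    found = [Path(p).name for p in find_jars(tmp)]
```

Only the two .jar files are picked up, so any additional dependency dropped into the directory would end up on the classpath without further changes.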
Important line in jcsv/__init__.py:
import jcsv.jvm
this forces the startup of the JVM, with the correct classpath, whenever someone uses the jcsv module
We need to ensure that some JVM is available on the system where jcsv is installed
Notice that the JVM is available as a Python dependency too, via the jdk4py package
This means that the JVM can be automatically downloaded and installed via pip:
pip install jdk4py
… or added as a dependency to the requirements.txt file:
JPype1==1.4.1
jdk4py==17.0.7.0
so, one may simply need to configure JPype to use that JVM:
# jcsv/jvm/__init__.py
import sys
import jpype
from jdk4py import JAVA_HOME

def jvm_lib_file_names():
    # name of the JVM shared library on each platform
    if sys.platform == "win32":
        return {"jvm.dll"}
    elif sys.platform == "darwin":
        return {"libjli.dylib"}
    else:
        return {"libjvm.so"}

def jvmlib():
    # first matching library under the JDK shipped with jdk4py, or None
    for name in jvm_lib_file_names():
        for path in JAVA_HOME.glob(f"**/{name}"):
            if path.exists():
                return str(path)
    return None

jpype.startJVM(jvmpath=jvmlib())
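The lookup of the JVM shared library can be tested without jdk4py or JPype. The sketch below is a hypothetical variant of the functions above, in which the JDK directory and the platform are passed as parameters so that a fake installation in a temporary directory can be searched.

```python
import sys
import tempfile
from pathlib import Path

def jvm_lib_file_name(platform=sys.platform):
    """Name of the JVM shared library on the given platform."""
    if platform == "win32":
        return "jvm.dll"
    if platform == "darwin":
        return "libjli.dylib"
    return "libjvm.so"

def find_jvm_lib(java_home, platform=sys.platform):
    """Return the first matching library found under `java_home`, or None."""
    name = jvm_lib_file_name(platform)
    for path in sorted(Path(java_home).glob(f"**/{name}")):
        if path.is_file():
            return str(path)
    return None

# Simulate a Linux-style JVM layout in a temporary directory
with tempfile.TemporaryDirectory() as fake_home:
    lib = Path(fake_home) / "lib" / "server" / "libjvm.so"
    lib.parent.mkdir(parents=True)
    lib.touch()
    found = find_jvm_lib(fake_home, platform="linux")
    missing = find_jvm_lib(fake_home, platform="win32")
    found_ok = found is not None and Path(found).name == "libjvm.so"
```

Since the lookup may return None, it seems prudent to check its result before handing it to `jpype.startJVM`, so that a missing library is reported with a clear message.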
Unit tests are essential to ensure the correctness of the Pythonic API
Consider for instance tests in:
test/test_parsing.py
test/test_python_api.py
It is important to test all the customisations and factory methods
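A test for the Pythonic API might look like the following sketch. It is not taken from the actual test files: the class name `TestHeaderApi` and the specific assertions are assumptions, based only on the `header` factory and the customisations described above, and the whole test case is skipped where the jcsv package is not installed.

```python
import importlib.util
import unittest

# Skip the whole test case where the jcsv package (and thus a JVM) is unavailable
JCSV_AVAILABLE = importlib.util.find_spec("jcsv") is not None

@unittest.skipUnless(JCSV_AVAILABLE, "jcsv is not installed")
class TestHeaderApi(unittest.TestCase):
    def setUp(self):
        import jcsv  # imported lazily: importing jcsv starts the JVM
        self.header = jcsv.header("column1", "column2", "column3")

    def test_len(self):
        # assumes Header inherits __len__ from the Row customisation
        self.assertEqual(len(self.header), 3)

    def test_index_out_of_range(self):
        # assumes the Java side raises IndexOutOfBoundsException for index 10
        with self.assertRaises(IndexError):
            self.header[10]

    def test_contains(self):
        self.assertIn("column1", self.header)

def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestHeaderApi)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project the same class would simply live in test/test_python_api.py and be discovered by `python -m unittest`.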
giovanni.ciatto@unibo.it
Compiled on: 2024-02-20