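A Performance Problem Caused by hashCode
I. Scenario
1. Overview
In this business system, multiple source bills can be pushed down to generate a single target bill, and the number of bills is large. In that situation, looking up the source bills from the target bill is slow.
2. Java classes
Data class: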
public class Data {
    private Long tableId;
    private Long billId;
    public Data(Long tableId, Long billId) {
        super();
        this.tableId = tableId;
        this.billId = billId;
    }
    public Long getTableId() {
        return tableId;
    }
    public void setTableId(Long tableId) {
        this.tableId = tableId;
    }
    public Long getBillId() {
        return billId;
    }
    public void setBillId(Long billId) {
        this.billId = billId;
    }
    @Override
    public int hashCode() {
        int hashCode = 31 * this.tableId.hashCode();
        hashCode = 31 * hashCode + this.billId.hashCode();
        return hashCode;
    }
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Data)) {
            return false;
        }
        Data other = (Data) obj;
        return this.tableId.equals(other.getTableId()) && this.billId.equals(other.getBillId());
    }
}
Relation class:
public class DataRelation {
    private Data data;
    private Data source;
    public Data getData() {
        return data;
    }
    public void setData(Data data) {
        this.data = data;
    }
    public Data getSource() {
        return source;
    }
    public void setSource(Data source) {
        this.source = source;
    }
    @Override
    public int hashCode() {
        int hashcode = 31 * (this.source == null ? 1 : this.source.hashCode());
        hashcode = 31 * (this.data == null ? 1 : this.data.hashCode());
        return hashcode;
    }
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof DataRelation)) {
            return false;
        }
        DataRelation other = (DataRelation) obj;
        return this.dataEquals(this.source, other.getSource()) && this.dataEquals(this.data, other.getData());
    }
    private boolean dataEquals(Data data, Data other) {
        return data == null && other == null ? true : (data != null && other != null ? data.equals(other) : false);
    }
}
Row class, simulating a record returned by the data query:
public class Row {
    private Long srcTableId;
    private Long srcBillId;
    private Long tgtTableId;
    private Long tgtBillId;
    public Row(Long srcTableId, Long srcBillId, Long tgtTableId, Long tgtBillId) {
        super();
        this.srcTableId = srcTableId;
        this.srcBillId = srcBillId;
        this.tgtTableId = tgtTableId;
        this.tgtBillId = tgtBillId;
    }
    public Long getSrcTableId() {
        return srcTableId;
    }
    public void setSrcTableId(Long srcTableId) {
        this.srcTableId = srcTableId;
    }
    public Long getSrcBillId() {
        return srcBillId;
    }
    public void setSrcBillId(Long srcBillId) {
        this.srcBillId = srcBillId;
    }
    public Long getTgtTableId() {
        return tgtTableId;
    }
    public void setTgtTableId(Long tgtTableId) {
        this.tgtTableId = tgtTableId;
    }
    public Long getTgtBillId() {
        return tgtBillId;
    }
    public void setTgtBillId(Long tgtBillId) {
        this.tgtBillId = tgtBillId;
    }
}
Utility class for querying relations:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BillTrackUtil {
    public static Set<DataRelation> getRelationByTgtBillId(Long tgtBillId) {
        Set<DataRelation> result = new HashSet<>();
        // Query the source bills
        List<Row> rows = querySourceBill(tgtBillId);
        long start = System.currentTimeMillis();
        for (Row row : rows) {
            Data source = new Data(row.getSrcTableId(), row.getSrcBillId());
            Data target = new Data(row.getTgtTableId(), row.getTgtBillId());
            DataRelation relation = new DataRelation();
            relation.setData(target);
            relation.setSource(source);
            result.add(relation);
        }
        long end = System.currentTimeMillis();
        // Elapsed time of the loop in milliseconds
        System.out.println("cost: " + (end - start));
        return result;
    }
    private static List<Row> querySourceBill(Long tgtBillId) {
        // Simulates a database query: SELECT FSTableId, FSBillId, FTTableId, FTBillId FROM XXX WHERE FTBillId = xxx
        List<Row> rows = new ArrayList<>();
        Long tgtTableId = 5001L;
        Long srcTableId = 1001L;
        Long srcBillId = 20000001L;
        for (int i = 1; i <= 50000; i++) {
            if (i % 1000 == 0) {
                srcTableId++;
            }
            srcBillId++;
            Row row = new Row(srcTableId, srcBillId, tgtTableId, tgtBillId);
            rows.add(row);
        }
        return rows;
    }
}
3. Running it
Call the utility class to fetch all source bills of a given bill:
Long tgtBillId = 60000001L;
Set<DataRelation> ret = BillTrackUtil.getRelationByTgtBillId(tgtBillId);
...
The run above simulates 50,000 source bills and takes 66538 ms, roughly one minute.
II. Analysis and Fix
- Analysis
Profiling the run with jvisualvm shows that most of the time is spent in the add method of HashSet.
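- Cause
The hashCode method of DataRelation is flawed: its second assignment overwrites the first instead of accumulating it, so source is ignored and only data (the target bill) contributes to the result. In this scenario every relation shares the same target bill, so all computed hash codes are identical and every element collides in the HashSet.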
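The collision can be reproduced without the full classes. The following is a minimal sketch, not the article's code: `BuggyHashDemo` and its method names are hypothetical, and it copies only the hash arithmetic of Data and of the original DataRelation, applied to the same 50,000 simulated rows.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; it reproduces only the hash arithmetic.
class BuggyHashDemo {
    // Same arithmetic as Data.hashCode() in the article.
    static int dataHash(long tableId, long billId) {
        int h = 31 * Long.hashCode(tableId);
        return 31 * h + Long.hashCode(billId);
    }

    // Same arithmetic as the original DataRelation.hashCode():
    // the second assignment overwrites the first, so sourceHash is discarded.
    static int buggyRelationHash(int sourceHash, int targetHash) {
        int h = 31 * sourceHash;
        h = 31 * targetHash;
        return h;
    }

    // Hashes the 50,000 simulated rows and collects the distinct hash values.
    static Set<Integer> distinctHashes(long tgtBillId) {
        Set<Integer> hashes = new HashSet<>();
        long srcTableId = 1001L;
        long srcBillId = 20000001L;
        int targetHash = dataHash(5001L, tgtBillId);
        for (int i = 1; i <= 50000; i++) {
            if (i % 1000 == 0) {
                srcTableId++;
            }
            srcBillId++;
            hashes.add(buggyRelationHash(dataHash(srcTableId, srcBillId), targetHash));
        }
        return hashes;
    }

    public static void main(String[] args) {
        // Prints "distinct hash codes: 1"
        System.out.println("distinct hash codes: " + distinctHashes(60000001L).size());
    }
}
```

With a single hash value, every element lands in the same bucket. Since the elements all differ under equals, each add has to be checked against elements already stored there, which is consistent with the time being spent in HashSet's add.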
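- Fix
Change the hashCode method of DataRelation to the following, so that the second step accumulates onto the first instead of overwriting it: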
@Override
public int hashCode() {
    int hashcode = 31 * (this.source == null ? 1 : this.source.hashCode());
    hashcode = 31 * hashcode + (this.data == null ? 1 : this.data.hashCode());
    return hashcode;
}
Running it again takes only 21 ms.
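To see why the corrected formula helps, one can count the distinct hash codes it produces for the simulated data. This is a sketch under the same assumptions as the article's simulation (50,000 rows, one shared target bill); `FixedHashDemo` and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; it reproduces only the hash arithmetic.
class FixedHashDemo {
    // Same arithmetic as Data.hashCode() in the article.
    static int dataHash(long tableId, long billId) {
        int h = 31 * Long.hashCode(tableId);
        return 31 * h + Long.hashCode(billId);
    }

    // Same arithmetic as the corrected DataRelation.hashCode():
    // the source hash is carried into the second step.
    static int fixedRelationHash(int sourceHash, int targetHash) {
        int h = 31 * sourceHash;
        h = 31 * h + targetHash;
        return h;
    }

    // Counts the distinct hash values over the 50,000 simulated rows.
    static int countDistinct(long tgtBillId) {
        Set<Integer> hashes = new HashSet<>();
        long srcTableId = 1001L;
        long srcBillId = 20000001L;
        int targetHash = dataHash(5001L, tgtBillId);
        for (int i = 1; i <= 50000; i++) {
            if (i % 1000 == 0) {
                srcTableId++;
            }
            srcBillId++;
            hashes.add(fixedRelationHash(dataHash(srcTableId, srcBillId), targetHash));
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        // Prints "distinct hash codes: 50000"
        System.out.println("distinct hash codes: " + countDistinct(60000001L));
    }
}
```

For this data set every relation gets its own hash code. The source hashes are all different, and multiplying by 961 (31 × 31, an odd number) never maps two different int values to the same result, even when the multiplication overflows.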
III. Ways to Compute a Hash Code
Common ways to compute a hash code are:
- Standard library helper (Objects.hash)
public int hashCode() {
    // Varargs helper; internally it is the same "multiply by 31 and add" chain
    return Objects.hash(field1, field2, field3);
}
- The "multiply by 31 and add" chain
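public int hashCode() {
    // Any non-zero initial value works; 31 is an odd prime, and the JIT compiler can
    // rewrite *31 as (i << 5) - i, so the multiplication costs almost nothing
    int result = 31;
    result = 31 * result + field1.hashCode(); // assumes field1 is never null
    result = 31 * result + (field2 == null ? 0 : field2.hashCode());
    return result;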
}
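The two approaches can be compared directly. The sketch below uses a hypothetical class `HashRecipeDemo`; it relies on the documented behaviour that Objects.hash delegates to Arrays.hashCode, which follows the List.hashCode rule: start at 1, then apply 31 * h + (e == null ? 0 : e.hashCode()) to each element. It also checks the shift identity mentioned in the comment above.

```java
import java.util.Objects;

// Hypothetical demo class, not part of the article's code.
class HashRecipeDemo {
    // Hand-written chain seeded with 1, mirroring the rule Objects.hash follows.
    static int manualChain(Object a, Object b) {
        int result = 1;
        result = 31 * result + (a == null ? 0 : a.hashCode());
        result = 31 * result + (b == null ? 0 : b.hashCode());
        return result;
    }

    // 31 * i and (i << 5) - i agree for every int, including on overflow.
    static boolean shiftIdentityHolds(int i) {
        return 31 * i == (i << 5) - i;
    }

    public static void main(String[] args) {
        Long tableId = 1001L;
        Long billId = 20000002L;
        // Each line prints "true"
        System.out.println(manualChain(tableId, billId) == Objects.hash(tableId, billId));
        System.out.println(manualChain(null, billId) == Objects.hash(null, billId));
        System.out.println(shiftIdentityHolds(Integer.MAX_VALUE));
    }
}
```

A chain seeded with 31, as in the snippet above, yields different numbers than Objects.hash, which is seeded with 1, but both combine the fields in the same way. Objects.hash is shorter and handles null fields, at the price of boxing primitives and allocating a varargs array on each call.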